2025-08-14T21:18:58.1115764Z Current runner version: '2.328.0' 2025-08-14T21:18:58.1122295Z Runner name: 'i-006ff0d0e48756a9a' 2025-08-14T21:18:58.1123074Z Runner group name: 'default' 2025-08-14T21:18:58.1123931Z Machine name: 'ip-10-0-74-8' 2025-08-14T21:18:58.1126865Z ##[group]GITHUB_TOKEN Permissions 2025-08-14T21:18:58.1129191Z Contents: read 2025-08-14T21:18:58.1129709Z Metadata: read 2025-08-14T21:18:58.1130193Z ##[endgroup] 2025-08-14T21:18:58.1132497Z Secret source: Actions 2025-08-14T21:18:58.1133162Z Prepare workflow directory 2025-08-14T21:18:58.1632741Z Prepare all required actions 2025-08-14T21:18:58.1669843Z Getting action download info 2025-08-14T21:18:58.4617649Z Download action repository 'pytorch/test-infra@main' (SHA:83f58f391e939c10dcb8cb6d745e4cefa3b98a84) 2025-08-14T21:18:59.9976012Z Download action repository 'pytorch/pytorch@main' (SHA:3be70dc30e893b552fc0f23ca06cd8f7949b6d08) 2025-08-14T21:19:15.1783121Z Download action repository 'actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065' (SHA:a26af69be951a213d495a4c3e4e4022e16d87065) 2025-08-14T21:19:15.5479106Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722) 2025-08-14T21:19:15.8139977Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-08-14T21:19:15.9679624Z Download action repository 'seemethere/upload-artifact-s3@baba72d0712b404f646cebe0730933554ebce96a' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-08-14T21:19:16.3430845Z Getting action download info 2025-08-14T21:19:16.4486407Z Download action repository 'actions/checkout@v4' (SHA:08eba0b27e820071cde6df949e0beb9ba4906955) 2025-08-14T21:19:16.7274858Z Getting action download info 2025-08-14T21:19:16.8561516Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e) 2025-08-14T21:19:17.1205935Z Getting action download info 2025-08-14T21:19:17.2770835Z Download action repository 'nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482' (SHA:3e91a01664abd3c5cd539100d10d33b9c5b68482) 2025-08-14T21:19:17.4583152Z Getting action download info 2025-08-14T21:19:17.6559521Z Uses: pytorch/pytorch/.github/workflows/_linux-test.yml@refs/heads/main (1fc683cf17c8c673044538d10266c00f92987be2) 2025-08-14T21:19:17.6563424Z ##[group] Inputs 2025-08-14T21:19:17.6563800Z build-environment: linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-08-14T21:19:17.6565021Z test-matrix: {"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]} 2025-08-14T21:19:17.6566482Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:19:17.6567288Z sync-tag: 2025-08-14T21:19:17.6568021Z timeout-minutes: 240 2025-08-14T21:19:17.6568277Z use-gha: 2025-08-14T21:19:17.6568487Z dashboard-tag: 2025-08-14T21:19:17.6568737Z s3-bucket: gha-artifacts 2025-08-14T21:19:17.6569005Z aws-role-to-assume: 2025-08-14T21:19:17.6569524Z disable-monitor: false 2025-08-14T21:19:17.6569809Z monitor-log-interval: 5 2025-08-14T21:19:17.6570107Z monitor-data-collect-interval: 1 2025-08-14T21:19:17.6570417Z ##[endgroup] 2025-08-14T21:19:17.6570890Z Complete job name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:19:17.7199414Z A job started hook has been configured by the self-hosted runner administrator 2025-08-14T21:19:17.7301910Z ##[group]Run '/home/ec2-user/runner-scripts/before_job.sh' 2025-08-14T21:19:17.7313142Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:19:17.7313717Z ##[endgroup] 2025-08-14T21:19:19.2068029Z Runner Type: linux.g5.4xlarge.nvidia.gpu 2025-08-14T21:19:19.2068967Z Instance Type: g5.4xlarge 2025-08-14T21:19:19.2069225Z AMI Name: unknown 2025-08-14T21:19:19.2111559Z AMI ID: ami-05ffe3c48a9991133 2025-08-14T21:19:24.7062440Z ##[group]Run pytorch/test-infra/.github/actions/setup-ssh@main 2025-08-14T21:19:24.7062839Z with: 2025-08-14T21:19:24.7063335Z github-secret: *** 2025-08-14T21:19:24.7063988Z instructions: All testing is done inside the container, to start an interactive session run: docker exec -it $(docker container ps --format '{{.ID}}') bash 2025-08-14T21:19:24.7064689Z activate-with-label: false 2025-08-14T21:19:24.7064958Z label: with-ssh 2025-08-14T21:19:24.7065196Z remove-existing-keys: true 2025-08-14T21:19:24.7065465Z fail-silently: true 2025-08-14T21:19:24.7065693Z env: 2025-08-14T21:19:24.7065910Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:19:24.7066165Z ##[endgroup] 2025-08-14T21:19:24.8459142Z Please see https://github.com/pytorch/pytorch/wiki/Debugging-using-with-ssh-for-Github-Actions for more info. 2025-08-14T21:19:24.8460727Z Not on pull request and ciflow reference could not be extracted, skipping adding ssh keys 2025-08-14T21:19:24.8654462Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main 2025-08-14T21:19:24.8654877Z with: 2025-08-14T21:19:24.8655089Z no-sudo: true 2025-08-14T21:19:24.8655329Z submodules: recursive 2025-08-14T21:19:24.8655594Z fetch-depth: 0 2025-08-14T21:19:24.8655814Z env: 2025-08-14T21:19:24.8656033Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:19:24.8656292Z ##[endgroup] 2025-08-14T21:19:24.8758754Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-08-14T21:19:24.8759653Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-08-14T21:19:24.8774237Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:19:24.8774614Z env: 2025-08-14T21:19:24.8774851Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:19:24.8775154Z ##[endgroup] 2025-08-14T21:19:24.8876287Z ##[group]Run # Use all available CPUs for fetching 2025-08-14T21:19:24.8876712Z # Use all available CPUs for fetching 2025-08-14T21:19:24.8877052Z cd "${GITHUB_WORKSPACE}" 2025-08-14T21:19:24.8877390Z git config --global fetch.parallel 0 2025-08-14T21:19:24.8877774Z git config --global submodule.fetchJobs 0 2025-08-14T21:19:24.8878113Z  2025-08-14T21:19:24.8878470Z # Clean workspace. The default checkout action should also do this, but 2025-08-14T21:19:24.8878923Z # do it here as well just in case 2025-08-14T21:19:24.8879248Z if [[ -d .git ]]; then 2025-08-14T21:19:24.8879542Z  if [ -z "${NO_SUDO}" ]; then 2025-08-14T21:19:24.8879864Z  sudo git clean -ffdx 2025-08-14T21:19:24.8880165Z  else 2025-08-14T21:19:24.8880407Z  git clean -ffdx 2025-08-14T21:19:24.8880688Z  fi 2025-08-14T21:19:24.8880914Z fi 2025-08-14T21:19:24.8890215Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:19:24.8890600Z env: 2025-08-14T21:19:24.8890891Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:19:24.8891176Z NO_SUDO: true 2025-08-14T21:19:24.8891405Z ##[endgroup] 2025-08-14T21:19:24.9057182Z ##[group]Run actions/checkout@v4 2025-08-14T21:19:24.9057475Z with: 2025-08-14T21:19:24.9057721Z ref: 1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:19:24.9058033Z fetch-depth: 0 2025-08-14T21:19:24.9058271Z submodules: recursive 2025-08-14T21:19:24.9058529Z show-progress: false 2025-08-14T21:19:24.9058789Z repository: pytorch/pytorch 2025-08-14T21:19:24.9059194Z token: *** 2025-08-14T21:19:24.9059426Z ssh-strict: true 2025-08-14T21:19:24.9059658Z ssh-user: git 2025-08-14T21:19:24.9059894Z persist-credentials: true 2025-08-14T21:19:24.9060157Z clean: true 2025-08-14T21:19:24.9060409Z sparse-checkout-cone-mode: true 2025-08-14T21:19:24.9060694Z fetch-tags: false 2025-08-14T21:19:24.9061147Z lfs: false 2025-08-14T21:19:24.9061376Z set-safe-directory: true 2025-08-14T21:19:24.9061640Z env: 2025-08-14T21:19:24.9061851Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:19:24.9062110Z ##[endgroup] 2025-08-14T21:19:25.0212014Z Syncing repository: pytorch/pytorch 2025-08-14T21:19:25.0213302Z ##[group]Getting Git version info 2025-08-14T21:19:25.0213749Z Working directory is '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2025-08-14T21:19:25.0214376Z [command]/usr/bin/git version 2025-08-14T21:19:25.0405033Z git version 2.47.1 2025-08-14T21:19:25.0440153Z ##[endgroup] 2025-08-14T21:19:25.0451689Z Copying '/home/ec2-user/.gitconfig' to '/home/ec2-user/actions-runner/_work/_temp/bb59b407-ad87-4f4e-a58f-5a0b2a602254/.gitconfig' 2025-08-14T21:19:25.0472297Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/bb59b407-ad87-4f4e-a58f-5a0b2a602254' before making global git config changes 2025-08-14T21:19:25.0473190Z Adding repository directory to the temporary git global config as a safe directory 2025-08-14T21:19:25.0486529Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-08-14T21:19:25.0540869Z Deleting the contents of '/home/ec2-user/actions-runner/_work/pytorch/pytorch' 2025-08-14T21:19:25.0544423Z ##[group]Initializing the repository 2025-08-14T21:19:25.0549019Z [command]/usr/bin/git init /home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-08-14T21:19:25.0615796Z hint: Using 'master' as the name for the initial branch. This default branch name 2025-08-14T21:19:25.0616394Z hint: is subject to change. To configure the initial branch name to use in all 2025-08-14T21:19:25.0616931Z hint: of your new repositories, which will suppress this warning, call: 2025-08-14T21:19:25.0617322Z hint: 2025-08-14T21:19:25.0617620Z hint: git config --global init.defaultBranch 2025-08-14T21:19:25.0617965Z hint: 2025-08-14T21:19:25.0618291Z hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and 2025-08-14T21:19:25.0618832Z hint: 'development'. The just-created branch can be renamed via this command: 2025-08-14T21:19:25.0619240Z hint: 2025-08-14T21:19:25.0619464Z hint: git branch -m 2025-08-14T21:19:25.0637461Z Initialized empty Git repository in /home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/ 2025-08-14T21:19:25.0650348Z [command]/usr/bin/git remote add origin https://github.com/pytorch/pytorch 2025-08-14T21:19:25.0699055Z ##[endgroup] 2025-08-14T21:19:25.0699753Z ##[group]Disabling automatic garbage collection 2025-08-14T21:19:25.0703791Z [command]/usr/bin/git config --local gc.auto 0 2025-08-14T21:19:25.0738406Z ##[endgroup] 2025-08-14T21:19:25.0739033Z ##[group]Setting up auth 2025-08-14T21:19:25.0745675Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-08-14T21:19:25.0781737Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-08-14T21:19:25.1207300Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-08-14T21:19:25.1240813Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-08-14T21:19:25.1653609Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-08-14T21:19:25.1708990Z ##[endgroup] 2025-08-14T21:19:25.1709547Z ##[group]Fetching the repository 2025-08-14T21:19:25.1716849Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-08-14T21:20:14.4701532Z From https://github.com/pytorch/pytorch 2025-08-14T21:20:14.4702348Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-08-14T21:20:14.4703072Z * [new branch] 5addvllmbuild -> origin/5addvllmbuild 2025-08-14T21:20:14.4707570Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-08-14T21:20:14.4708257Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-08-14T21:20:14.4708944Z * [new branch] JackCaoG/dynamo_make_fx_non_core_aten_ops -> origin/JackCaoG/dynamo_make_fx_non_core_aten_ops 2025-08-14T21:20:14.4711225Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-08-14T21:20:14.4713480Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-08-14T21:20:14.4715288Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-08-14T21:20:14.4717094Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-08-14T21:20:14.4718999Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-08-14T21:20:14.4721122Z * [new branch] Update-Flash-Packaging -> origin/Update-Flash-Packaging 2025-08-14T21:20:14.4722762Z * [new branch] add-missing-args-normalization -> origin/add-missing-args-normalization 2025-08-14T21:20:14.4724836Z * [new branch] add-user-guide-structure -> origin/add-user-guide-structure 2025-08-14T21:20:14.4726793Z * [new branch] addVllmPin -> origin/addVllmPin 2025-08-14T21:20:14.4728830Z * [new branch] add_windows_testing_back -> origin/add_windows_testing_back 2025-08-14T21:20:14.4730866Z * [new branch] addbuildvllm -> origin/addbuildvllm 2025-08-14T21:20:14.4733201Z * [new branch] addmm-heuristic -> origin/addmm-heuristic 2025-08-14T21:20:14.4735096Z * [new branch] addsimde -> origin/addsimde 2025-08-14T21:20:14.4737031Z * [new branch] addvllpinnedfile -> origin/addvllpinnedfile 2025-08-14T21:20:14.4758502Z * [new branch] adi/acl_upgrade -> origin/adi/acl_upgrade 2025-08-14T21:20:14.4759109Z * [new branch] adi/skip_slow_tests -> origin/adi/skip_slow_tests 2025-08-14T21:20:14.4759570Z * [new branch] adi/test -> origin/adi/test 2025-08-14T21:20:14.4760102Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-08-14T21:20:14.4760566Z * [new branch] adi/test_fusions -> origin/adi/test_fusions 2025-08-14T21:20:14.4761057Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-08-14T21:20:14.4761558Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-08-14T21:20:14.4762039Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-08-14T21:20:14.4762527Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-08-14T21:20:14.4763123Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-08-14T21:20:14.4763676Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-08-14T21:20:14.4764211Z * [new branch] albanD-patch-1 -> origin/albanD-patch-1 2025-08-14T21:20:14.4764773Z * [new branch] alt-disable -> origin/alt-disable 2025-08-14T21:20:14.4768004Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-08-14T21:20:14.4769872Z * [new branch] angelayi/aoti_inductor_fx -> origin/angelayi/aoti_inductor_fx 2025-08-14T21:20:14.4772102Z * [new branch] angelayi/assert_tensor_metadata_device -> origin/angelayi/assert_tensor_metadata_device 2025-08-14T21:20:14.4774104Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-08-14T21:20:14.4776398Z * [new branch] angelayi/benchmark2 -> origin/angelayi/benchmark2 2025-08-14T21:20:14.4779072Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-08-14T21:20:14.4781624Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-08-14T21:20:14.4784470Z * [new branch] angelayi/custom_op_subgraph -> origin/angelayi/custom_op_subgraph 2025-08-14T21:20:14.4786644Z * [new branch] angelayi/customop -> origin/angelayi/customop 2025-08-14T21:20:14.4788891Z * [new branch] angelayi/del_lib -> origin/angelayi/del_lib 2025-08-14T21:20:14.4790949Z * [new branch] angelayi/docs -> origin/angelayi/docs 2025-08-14T21:20:14.4793167Z * [new branch] angelayi/docs2 -> origin/angelayi/docs2 2025-08-14T21:20:14.4795425Z * [new branch] angelayi/fix_pt2 -> origin/angelayi/fix_pt2 2025-08-14T21:20:14.4797852Z * [new branch] angelayi/logging.bak -> origin/angelayi/logging.bak 2025-08-14T21:20:14.4799854Z * [new branch] angelayi/logging2 -> origin/angelayi/logging2 2025-08-14T21:20:14.4801990Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-08-14T21:20:14.4804216Z * [new branch] angelayi/pytree -> origin/angelayi/pytree 2025-08-14T21:20:14.4806509Z * [new branch] angelayi/save_error -> origin/angelayi/save_error 2025-08-14T21:20:14.4808739Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-08-14T21:20:14.4811003Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-08-14T21:20:14.4813368Z * [new branch] angelayi/tensor_nn_module_meta -> origin/angelayi/tensor_nn_module_meta 2025-08-14T21:20:14.4815454Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-08-14T21:20:14.4817591Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-08-14T21:20:14.4819773Z * [new branch] aoti_weight_sharing -> origin/aoti_weight_sharing 2025-08-14T21:20:14.4822675Z * [new branch] arsh/symint_mm_ind_decomp -> origin/arsh/symint_mm_ind_decomp 2025-08-14T21:20:14.4825001Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-08-14T21:20:14.4827256Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-08-14T21:20:14.4829413Z * [new branch] atalman-patch-1 -> origin/atalman-patch-1 2025-08-14T21:20:14.4831659Z * [new branch] atalman-patch-2 -> origin/atalman-patch-2 2025-08-14T21:20:14.4833962Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-08-14T21:20:14.4836092Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-08-14T21:20:14.4838235Z * [new branch] atalman-patch-7 -> origin/atalman-patch-7 2025-08-14T21:20:14.4840443Z * [new branch] atalman-patch-8 -> origin/atalman-patch-8 2025-08-14T21:20:14.4842686Z * [new branch] atalman_inductor_2.3.0 -> origin/atalman_inductor_2.3.0 2025-08-14T21:20:14.4844962Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-08-14T21:20:14.4847372Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-08-14T21:20:14.4849682Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-08-14T21:20:14.4852163Z * [new branch] autoupdate-transformers-pin-via-pr -> origin/autoupdate-transformers-pin-via-pr 2025-08-14T21:20:14.4854062Z * [new branch] backupvllm -> origin/backupvllm 2025-08-14T21:20:14.4857189Z * [new branch] base/1.5 -> origin/base/1.5 2025-08-14T21:20:14.4859516Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-08-14T21:20:14.4861568Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-08-14T21:20:14.4863875Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-08-14T21:20:14.4866983Z * [new branch] benjaminglass1/mark-large-tensor-tests-serial -> origin/benjaminglass1/mark-large-tensor-tests-serial 2025-08-14T21:20:14.4869723Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-08-14T21:20:14.4872587Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-08-14T21:20:14.4875458Z * [new branch] bf/cg-log -> origin/bf/cg-log 2025-08-14T21:20:14.4877569Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-08-14T21:20:14.4879714Z * [new branch] bf/cg-skip-1-kernel -> origin/bf/cg-skip-1-kernel 2025-08-14T21:20:14.4881714Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-08-14T21:20:14.4883768Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-08-14T21:20:14.4886604Z * [new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-08-14T21:20:14.4888862Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-08-14T21:20:14.4890920Z * [new branch] bf/default-recompile-reason -> origin/bf/default-recompile-reason 2025-08-14T21:20:14.4893086Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-08-14T21:20:14.4895167Z * [new branch] bf/improve-kernel-bench -> origin/bf/improve-kernel-bench 2025-08-14T21:20:14.4897426Z * [new branch] bf/kernel-benchmark -> origin/bf/kernel-benchmark 2025-08-14T21:20:14.4899523Z * [new branch] bf/partition-doc -> origin/bf/partition-doc 2025-08-14T21:20:14.4901736Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-08-14T21:20:14.4903909Z * [new branch] bf/partition-turn-on -> origin/bf/partition-turn-on 2025-08-14T21:20:14.4906011Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-08-14T21:20:14.4908072Z * [new branch] bf/skip-asserts -> origin/bf/skip-asserts 2025-08-14T21:20:14.4910146Z * [new branch] bf16adamw -> origin/bf16adamw 2025-08-14T21:20:14.4912391Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-08-14T21:20:14.4914553Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-08-14T21:20:14.4916712Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-08-14T21:20:14.4918808Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-08-14T21:20:14.4920859Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-08-14T21:20:14.4922943Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-08-14T21:20:14.4925115Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-08-14T21:20:14.4927208Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-08-14T21:20:14.4929247Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-08-14T21:20:14.4931368Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-08-14T21:20:14.4933460Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-08-14T21:20:14.4935531Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-08-14T21:20:14.4937651Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-08-14T21:20:14.4939772Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-08-14T21:20:14.4941855Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-08-14T21:20:14.4944061Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-08-14T21:20:14.4947287Z * [new branch] bowbao/bench_updates_stage -> origin/bowbao/bench_updates_stage 2025-08-14T21:20:14.4949416Z * [new branch] bowbao/dort_rewriter -> origin/bowbao/dort_rewriter 2025-08-14T21:20:14.4951541Z * [new branch] bowbao/wip_prs -> origin/bowbao/wip_prs 2025-08-14T21:20:14.4954531Z * [new branch] bowenbao/partial_min_max_reduce -> origin/bowenbao/partial_min_max_reduce 2025-08-14T21:20:14.4957212Z * [new branch] brister/always_wrapper_ir -> origin/brister/always_wrapper_ir 2025-08-14T21:20:14.4959268Z * [new branch] brister/flatten_contig -> origin/brister/flatten_contig 2025-08-14T21:20:14.4961254Z * [new branch] brister/test_block_ptr_same -> origin/brister/test_block_ptr_same 2025-08-14T21:20:14.4963462Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-08-14T21:20:14.4965818Z * [new branch] c57382a49 -> origin/c57382a49 2025-08-14T21:20:14.4967860Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-08-14T21:20:14.4969931Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-08-14T21:20:14.4973254Z * [new branch] camyll/revert-94bc900da97ad7f3c35b3b819bb53b23c74b581a-for-release-2.8 -> origin/camyll/revert-94bc900da97ad7f3c35b3b819bb53b23c74b581a-for-release-2.8 2025-08-14T21:20:14.4974756Z * [new branch] camyll/test_precommit_hooks_lintrunner -> origin/camyll/test_precommit_hooks_lintrunner 2025-08-14T21:20:14.4977861Z * [new branch] camyllh/cherrypick-151547-for-release28 -> origin/camyllh/cherrypick-151547-for-release28 2025-08-14T21:20:14.4979827Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-08-14T21:20:14.4982099Z * [new branch] cherry-pick-149654-by-pytorch_bot_bot_ -> origin/cherry-pick-149654-by-pytorch_bot_bot_ 2025-08-14T21:20:14.4984302Z * [new branch] cherry-pick-151939-by-pytorch_bot_bot_ -> origin/cherry-pick-151939-by-pytorch_bot_bot_ 2025-08-14T21:20:14.4986558Z * [new branch] cherry-pick-154174-by-pytorch_bot_bot_ -> origin/cherry-pick-154174-by-pytorch_bot_bot_ 2025-08-14T21:20:14.4989050Z * [new branch] cherry-pick-155896-by-pytorch_bot_bot_ -> origin/cherry-pick-155896-by-pytorch_bot_bot_ 2025-08-14T21:20:14.4991141Z * [new branch] cherry-pick-156260-by-pytorch_bot_bot_ -> origin/cherry-pick-156260-by-pytorch_bot_bot_ 2025-08-14T21:20:14.4993267Z * [new branch] cherry-pick-156719-by-pytorch_bot_bot_ -> origin/cherry-pick-156719-by-pytorch_bot_bot_ 2025-08-14T21:20:14.4995434Z * [new branch] cherry-pick-156876-by-pytorch_bot_bot_ -> origin/cherry-pick-156876-by-pytorch_bot_bot_ 2025-08-14T21:20:14.4997907Z * [new branch] cherry-pick-156888-by-pytorch_bot_bot_ -> origin/cherry-pick-156888-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5000087Z * [new branch] cherry-pick-157014-by-pytorch_bot_bot_ -> origin/cherry-pick-157014-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5002342Z * [new branch] cherry-pick-157179-by-pytorch_bot_bot_ -> origin/cherry-pick-157179-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5004467Z * [new branch] cherry-pick-157453-by-pytorch_bot_bot_ -> origin/cherry-pick-157453-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5006709Z * [new branch] cherry-pick-157513-by-pytorch_bot_bot_ -> origin/cherry-pick-157513-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5008863Z * [new branch] cherry-pick-157558-by-pytorch_bot_bot_ -> origin/cherry-pick-157558-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5010958Z * [new branch] cherry-pick-157598-by-pytorch_bot_bot_ -> origin/cherry-pick-157598-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5013212Z * [new branch] cherry-pick-157600-by-pytorch_bot_bot_ -> origin/cherry-pick-157600-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5015469Z * [new branch] cherry-pick-157630-by-pytorch_bot_bot_ -> origin/cherry-pick-157630-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5017773Z * [new branch] cherry-pick-157695-by-pytorch_bot_bot_ -> origin/cherry-pick-157695-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5019851Z * [new branch] cherry-pick-157732-by-pytorch_bot_bot_ -> origin/cherry-pick-157732-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5022065Z * [new branch] cherry-pick-157733-by-pytorch_bot_bot_ -> origin/cherry-pick-157733-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5024235Z * [new branch] cherry-pick-157985-by-pytorch_bot_bot_ -> origin/cherry-pick-157985-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5026371Z * [new branch] cherry-pick-157993-by-pytorch_bot_bot_ -> origin/cherry-pick-157993-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5028696Z * [new branch] cherry-pick-158064-by-pytorch_bot_bot_ -> origin/cherry-pick-158064-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5030909Z * [new branch] cherry-pick-158152-by-pytorch_bot_bot_ -> origin/cherry-pick-158152-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5033065Z * [new branch] cherry-pick-158295-by-pytorch_bot_bot_ -> origin/cherry-pick-158295-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5035307Z * [new branch] cherry-pick-158301-by-pytorch_bot_bot_ -> origin/cherry-pick-158301-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5037479Z * [new branch] cherry-pick-158537-by-pytorch_bot_bot_ -> origin/cherry-pick-158537-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5039641Z * [new branch] cherry-pick-158572-by-pytorch_bot_bot_ -> origin/cherry-pick-158572-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5041742Z * [new branch] cherry-pick-158595 -> origin/cherry-pick-158595 2025-08-14T21:20:14.5044048Z * [new branch] cherry-pick-159181-by-pytorch_bot_bot_ -> origin/cherry-pick-159181-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5046483Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5049000Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-08-14T21:20:14.5051018Z * [new branch] cherry-pick-PR-158746 -> origin/cherry-pick-PR-158746 2025-08-14T21:20:14.5053392Z * [new branch] cherrypick-e4e2701429c17078c3c475382a8b1fa4c8a8cefc -> origin/cherrypick-e4e2701429c17078c3c475382a8b1fa4c8a8cefc 2025-08-14T21:20:14.5056077Z * [new branch] chilli/flex_vllm -> origin/chilli/flex_vllm 2025-08-14T21:20:14.5058409Z * [new branch] ckluk2-compileThread-1 -> origin/ckluk2-compileThread-1 2025-08-14T21:20:14.5060566Z * [new branch] ckluk2-compileThread-2 -> origin/ckluk2-compileThread-2 2025-08-14T21:20:14.5062642Z * [new branch] ckluk2-compileThread-64 -> origin/ckluk2-compileThread-64 2025-08-14T21:20:14.5064735Z * [new branch] ckluk2-test-1 -> origin/ckluk2-test-1 2025-08-14T21:20:14.5066992Z * [new branch] cleantest1 -> origin/cleantest1 2025-08-14T21:20:14.5069186Z * [new branch] codex-testing -> origin/codex-testing 2025-08-14T21:20:14.5072336Z * [new branch] codex/create-test-for-tensor-memory-leak-in-cudagraph -> origin/codex/create-test-for-tensor-memory-leak-in-cudagraph 2025-08-14T21:20:14.5073937Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-08-14T21:20:14.5076169Z * [new branch] codex/fix-issue-160415-in-pytorch -> origin/codex/fix-issue-160415-in-pytorch 2025-08-14T21:20:14.5079084Z * [new branch] codex/fix-noqengine-quantized-engine-support -> origin/codex/fix-noqengine-quantized-engine-support 2025-08-14T21:20:14.5081554Z * [new branch] codex/fix-pin_memory-error-handling -> origin/codex/fix-pin_memory-error-handling 2025-08-14T21:20:14.5083728Z * [new branch] codex/propose-fix-for-issue-160332 -> origin/codex/propose-fix-for-issue-160332 2025-08-14T21:20:14.5086377Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-08-14T21:20:14.5088362Z * [new branch] codex/verify-torch-output-and-log-results -> origin/codex/verify-torch-output-and-log-results 2025-08-14T21:20:14.5090327Z * [new branch] compile_fsdp2_disable_stream_and_event -> origin/compile_fsdp2_disable_stream_and_event 2025-08-14T21:20:14.5092451Z * [new branch] comply-with-setuptools -> origin/comply-with-setuptools 2025-08-14T21:20:14.5094659Z * [new branch] context_test -> origin/context_test 2025-08-14T21:20:14.5097668Z * [new branch] copilot/fix-157446 -> origin/copilot/fix-157446 2025-08-14T21:20:14.5099693Z * [new branch] copilot/fix-159257 -> origin/copilot/fix-159257 2025-08-14T21:20:14.5101789Z * [new branch] copy_graph -> origin/copy_graph 2025-08-14T21:20:14.5104659Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-08-14T21:20:14.5107415Z * [new branch] csl/3_proc_sm -> origin/csl/3_proc_sm 2025-08-14T21:20:14.5109481Z * [new branch] csl/add_file_merge_conflict_csv -> origin/csl/add_file_merge_conflict_csv 2025-08-14T21:20:14.5111433Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-08-14T21:20:14.5113564Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-08-14T21:20:14.5115585Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-08-14T21:20:14.5118004Z * [new branch] csl/disable_flaky_cpp_test -> origin/csl/disable_flaky_cpp_test 2025-08-14T21:20:14.5120523Z * [new branch] csl/disable_periodic_test -> origin/csl/disable_periodic_test 2025-08-14T21:20:14.5122627Z * [new branch] csl/executorch_docker_fail -> origin/csl/executorch_docker_fail 2025-08-14T21:20:14.5124733Z * [new branch] csl/fix_check_alerts -> origin/csl/fix_check_alerts 2025-08-14T21:20:14.5126821Z * [new branch] csl/katex -> origin/csl/katex 2025-08-14T21:20:14.5129027Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-08-14T21:20:14.5131249Z * [new branch] csl/lintrunner_changed_files_removed -> origin/csl/lintrunner_changed_files_removed 2025-08-14T21:20:14.5133380Z * [new branch] csl/lintrunner_changed_files_removed_test -> origin/csl/lintrunner_changed_files_removed_test 2025-08-14T21:20:14.5135411Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-08-14T21:20:14.5137503Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-08-14T21:20:14.5139737Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-08-14T21:20:14.5141752Z * [new branch] csl/no_keep_goin_rocm -> origin/csl/no_keep_goin_rocm 2025-08-14T21:20:14.5143837Z * [new branch] csl/not_600_timeout -> origin/csl/not_600_timeout 2025-08-14T21:20:14.5146038Z * [new branch] csl/remove_unused_docker_images -> origin/csl/remove_unused_docker_images 2025-08-14T21:20:14.5148964Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-08-14T21:20:14.5151177Z * [new branch] csl/rocm_upload_artifacts_while_running -> origin/csl/rocm_upload_artifacts_while_running 2025-08-14T21:20:14.5153200Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-08-14T21:20:14.5155356Z * [new branch] csl/td_dynamo -> origin/csl/td_dynamo 2025-08-14T21:20:14.5157531Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-08-14T21:20:14.5159817Z * [new branch] csl/unused_docker -> origin/csl/unused_docker 2025-08-14T21:20:14.5161798Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-08-14T21:20:14.5163948Z * [new branch] cublasltrelax2 -> origin/cublasltrelax2 2025-08-14T21:20:14.5166201Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-08-14T21:20:14.5168341Z * [new branch] cudnnsdparefactor -> origin/cudnnsdparefactor 2025-08-14T21:20:14.5170541Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-08-14T21:20:14.5172576Z * [new branch] czhuge_muon_dev -> origin/czhuge_muon_dev 2025-08-14T21:20:14.5175493Z * [new branch] d4l3k/delete_hook -> origin/d4l3k/delete_hook 2025-08-14T21:20:14.5177613Z * [new branch] d4l3k/dist_queue -> origin/d4l3k/dist_queue 2025-08-14T21:20:14.5179589Z * [new branch] d4l3k/wait_stream -> origin/d4l3k/wait_stream 2025-08-14T21:20:14.5181735Z * [new branch] dcp-safetensor-test-fix -> origin/dcp-safetensor-test-fix 2025-08-14T21:20:14.5183800Z * [new branch] dcp_zoc -> origin/dcp_zoc 2025-08-14T21:20:14.5185977Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-08-14T21:20:14.5190751Z * [new branch] dependabot/pip/dot-ci/docker/protobuf-5.29.5 -> origin/dependabot/pip/dot-ci/docker/protobuf-5.29.5 2025-08-14T21:20:14.5193415Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-08-14T21:20:14.5195529Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-08-14T21:20:14.5199170Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-08-14T21:20:14.5201503Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-08-14T21:20:14.5203853Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-08-14T21:20:14.5206295Z * [new branch] dev/joona/cat_remove_graph -> origin/dev/joona/cat_remove_graph 2025-08-14T21:20:14.5208323Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-08-14T21:20:14.5210599Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-08-14T21:20:14.5213273Z * [new branch] dev/joona/maxpool2dwithindices_errmsg -> origin/dev/joona/maxpool2dwithindices_errmsg 2025-08-14T21:20:14.5215899Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-08-14T21:20:14.5218642Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-08-14T21:20:14.5221021Z * [new branch] dev/joona/synchronize_benchmark -> origin/dev/joona/synchronize_benchmark 2025-08-14T21:20:14.5223401Z * [new branch] dev/joona/topk_newapi -> origin/dev/joona/topk_newapi 2025-08-14T21:20:14.5225696Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-08-14T21:20:14.5228015Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-08-14T21:20:14.5230192Z * [new branch] disable -> origin/disable 2025-08-14T21:20:14.5232537Z * [new branch] divyanshk-log-api-usage-datapipes-1 -> origin/divyanshk-log-api-usage-datapipes-1 2025-08-14T21:20:14.5234530Z * [new branch] e2e-baseline -> origin/e2e-baseline 2025-08-14T21:20:14.5237476Z * [new branch] embg/test_inductor_ci_128B -> origin/embg/test_inductor_ci_128B 2025-08-14T21:20:14.5239527Z * [new branch] embg/test_inductor_ci_base -> origin/embg/test_inductor_ci_base 2025-08-14T21:20:14.5241747Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-08-14T21:20:14.5243669Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-08-14T21:20:14.5246159Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-08-14T21:20:14.5248655Z * [new branch] enable-b200-benchmark -> origin/enable-b200-benchmark 2025-08-14T21:20:14.5250764Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-08-14T21:20:14.5252829Z * [new branch] eqy-patch-10 -> origin/eqy-patch-10 2025-08-14T21:20:14.5255061Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-08-14T21:20:14.5257231Z * [new branch] example-convert-torch.nn -> origin/example-convert-torch.nn 2025-08-14T21:20:14.5260040Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-08-14T21:20:14.5262156Z * [new branch] exclamaforte/bump-transformer-version -> origin/exclamaforte/bump-transformer-version 2025-08-14T21:20:14.5264147Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-08-14T21:20:14.5266247Z * [new branch] exclamaforte/debug-autotuner-profile -> origin/exclamaforte/debug-autotuner-profile 2025-08-14T21:20:14.5268237Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-08-14T21:20:14.5270712Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-08-14T21:20:14.5273504Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-08-14T21:20:14.5275699Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-08-14T21:20:14.5277771Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-08-14T21:20:14.5279730Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-08-14T21:20:14.5281854Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-08-14T21:20:14.5283920Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-08-14T21:20:14.5286296Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-08-14T21:20:14.5288554Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-08-14T21:20:14.5290559Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-08-14T21:20:14.5292701Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-08-14T21:20:14.5294928Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-08-14T21:20:14.5296929Z * [new branch] exclamaforte/memory-counter -> origin/exclamaforte/memory-counter 2025-08-14T21:20:14.5299102Z * [new branch] exclamaforte/scheduler-refactor -> origin/exclamaforte/scheduler-refactor 2025-08-14T21:20:14.5301335Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-08-14T21:20:14.5303467Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-08-14T21:20:14.5305566Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-08-14T21:20:14.5307657Z * [new branch] exclamaforte/update-pandas-numpy-ci -> origin/exclamaforte/update-pandas-numpy-ci 2025-08-14T21:20:14.5310604Z * [new branch] exclamforte/gemm-model-final -> origin/exclamforte/gemm-model-final 2025-08-14T21:20:14.5312633Z * [new branch] exec -> origin/exec 2025-08-14T21:20:14.5314903Z * [new branch] experimental-mosaic -> origin/experimental-mosaic 2025-08-14T21:20:14.5317056Z * [new branch] export-D58091437 -> origin/export-D58091437 2025-08-14T21:20:14.5319326Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-08-14T21:20:14.5321459Z * [new branch] export-D68846308 -> origin/export-D68846308 2025-08-14T21:20:14.5323634Z * [new branch] export-D70112642 -> origin/export-D70112642 2025-08-14T21:20:14.5326557Z * [new branch] export-D71412006 -> origin/export-D71412006 2025-08-14T21:20:14.5328714Z * [new branch] export-D72483950 -> origin/export-D72483950 2025-08-14T21:20:14.5330941Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-08-14T21:20:14.5333003Z * [new branch] export-D73287751 -> origin/export-D73287751 2025-08-14T21:20:14.5335251Z * [new branch] export-D75183591 -> origin/export-D75183591 2025-08-14T21:20:14.5337425Z * [new branch] export-D75605373 -> origin/export-D75605373 2025-08-14T21:20:14.5339691Z * [new branch] export-D75617432 -> origin/export-D75617432 2025-08-14T21:20:14.5341810Z * [new branch] export-D75659965 -> origin/export-D75659965 2025-08-14T21:20:14.5344075Z * [new branch] export-D76080931 -> origin/export-D76080931 2025-08-14T21:20:14.5346225Z * [new branch] export-D76463347 -> origin/export-D76463347 2025-08-14T21:20:14.5348642Z * [new branch] export-D76797250 -> origin/export-D76797250 2025-08-14T21:20:14.5350780Z * [new branch] export-D76885271 -> origin/export-D76885271 2025-08-14T21:20:14.5352859Z * [new branch] export-D76885620 -> origin/export-D76885620 2025-08-14T21:20:14.5355545Z * [new branch] export-D76936623 -> origin/export-D76936623 2025-08-14T21:20:14.5357590Z * [new branch] export-D76958268 -> origin/export-D76958268 2025-08-14T21:20:14.5359727Z * [new branch] export-D78047846 -> origin/export-D78047846 2025-08-14T21:20:14.5361821Z * [new branch] export-D78308105 -> origin/export-D78308105 2025-08-14T21:20:14.5363936Z * [new branch] export-D78363609 -> origin/export-D78363609 2025-08-14T21:20:14.5366249Z * [new branch] export-D78375400 -> origin/export-D78375400 2025-08-14T21:20:14.5368366Z * [new branch] export-D78431075 -> origin/export-D78431075 2025-08-14T21:20:14.5370521Z * [new branch] export-D78431305 -> origin/export-D78431305 2025-08-14T21:20:14.5372910Z * [new branch] export-D78458745 -> origin/export-D78458745 2025-08-14T21:20:14.5375151Z * [new branch] export-D78524147 -> origin/export-D78524147 2025-08-14T21:20:14.5377277Z * [new branch] export-D78580107 -> origin/export-D78580107 2025-08-14T21:20:14.5379389Z * [new branch] export-D78588406 -> origin/export-D78588406 2025-08-14T21:20:14.5381581Z * [new branch] export-D78691422 -> origin/export-D78691422 2025-08-14T21:20:14.5383778Z * [new branch] export-D78758466 -> origin/export-D78758466 2025-08-14T21:20:14.5385936Z * [new branch] export-D78822171 -> origin/export-D78822171 2025-08-14T21:20:14.5388037Z * [new branch] export-D78822351 -> origin/export-D78822351 2025-08-14T21:20:14.5390063Z * [new branch] export-D78822507 -> origin/export-D78822507 2025-08-14T21:20:14.5392306Z * [new branch] export-D78826994 -> origin/export-D78826994 2025-08-14T21:20:14.5394256Z * [new branch] export-D78894142 -> origin/export-D78894142 2025-08-14T21:20:14.5396341Z * [new branch] export-D78894324 -> origin/export-D78894324 2025-08-14T21:20:14.5398374Z * [new branch] export-D78907485 -> origin/export-D78907485 2025-08-14T21:20:14.5400509Z * [new branch] export-D78929245 -> origin/export-D78929245 2025-08-14T21:20:14.5402603Z * [new branch] export-D78934925 -> origin/export-D78934925 2025-08-14T21:20:14.5404802Z * [new branch] export-D78953203 -> origin/export-D78953203 2025-08-14T21:20:14.5406922Z * [new branch] export-D78953229 -> origin/export-D78953229 2025-08-14T21:20:14.5409015Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-08-14T21:20:14.5411126Z * [new branch] export-D78957389 -> origin/export-D78957389 2025-08-14T21:20:14.5413194Z * [new branch] export-D78957974 -> origin/export-D78957974 2025-08-14T21:20:14.5415331Z * [new branch] export-D78979812 -> origin/export-D78979812 2025-08-14T21:20:14.5417401Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-08-14T21:20:14.5419514Z * [new branch] export-D79026433 -> origin/export-D79026433 2025-08-14T21:20:14.5421646Z * [new branch] export-D79230339 -> origin/export-D79230339 2025-08-14T21:20:14.5424042Z * [new branch] export-D79319835 -> origin/export-D79319835 2025-08-14T21:20:14.5426133Z * [new branch] export-D79328456 -> origin/export-D79328456 2025-08-14T21:20:14.5428424Z * [new branch] export-D79534608 -> origin/export-D79534608 2025-08-14T21:20:14.5430604Z * [new branch] export-D79647167 -> origin/export-D79647167 2025-08-14T21:20:14.5432807Z * [new branch] export-D79751098 -> origin/export-D79751098 2025-08-14T21:20:14.5435171Z * [new branch] export-D79785974 -> origin/export-D79785974 2025-08-14T21:20:14.5437375Z * [new branch] export-D80025417 -> origin/export-D80025417 2025-08-14T21:20:14.5439600Z * [new branch] export-D80120333 -> origin/export-D80120333 2025-08-14T21:20:14.5441823Z * [new branch] export-D80214882 -> origin/export-D80214882 2025-08-14T21:20:14.5444124Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-08-14T21:20:14.5447318Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-08-14T21:20:14.5449732Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-08-14T21:20:14.5451855Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-08-14T21:20:14.5454898Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-08-14T21:20:14.5457172Z * [new branch] fca -> origin/fca 2025-08-14T21:20:14.5459341Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-08-14T21:20:14.5461463Z * [new branch] fca5 -> origin/fca5 2025-08-14T21:20:14.5464415Z * [new branch] feature/function-numa-binding -> origin/feature/function-numa-binding 2025-08-14T21:20:14.5467324Z * [new branch] fengyuan/external-proj -> origin/fengyuan/external-proj 2025-08-14T21:20:14.5469521Z * [new branch] fengyuan/out-of-tree-xpu-ops-improve-test -> origin/fengyuan/out-of-tree-xpu-ops-improve-test 2025-08-14T21:20:14.5471726Z * [new branch] fengyuan/out-of-tree-xpu-ops-remove-dtype -> origin/fengyuan/out-of-tree-xpu-ops-remove-dtype 2025-08-14T21:20:14.5473847Z * [new branch] fengyuan/test-xpu -> origin/fengyuan/test-xpu 2025-08-14T21:20:14.5476469Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-08-14T21:20:14.5478612Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-08-14T21:20:14.5481842Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-08-14T21:20:14.5483887Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-08-14T21:20:14.5486137Z * [new branch] findhao/fix-indirect-access -> origin/findhao/fix-indirect-access 2025-08-14T21:20:14.5488093Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-08-14T21:20:14.5490006Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-08-14T21:20:14.5491981Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-08-14T21:20:14.5493961Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-08-14T21:20:14.5496554Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-08-14T21:20:14.5499094Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-08-14T21:20:14.5501309Z * [new branch] fix -> origin/fix 2025-08-14T21:20:14.5503586Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-08-14T21:20:14.5505653Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-08-14T21:20:14.5507797Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-08-14T21:20:14.5510079Z * [new branch] fix-distributed-warning -> origin/fix-distributed-warning 2025-08-14T21:20:14.5512337Z * [new branch] fix-inductor-periodic-0528 -> origin/fix-inductor-periodic-0528 2025-08-14T21:20:14.5514610Z * [new branch] fix-rlease-feature-template -> origin/fix-rlease-feature-template 2025-08-14T21:20:14.5516642Z * [new branch] fix_153389 -> origin/fix_153389 2025-08-14T21:20:14.5518874Z * [new branch] fixes-triage -> origin/fixes-triage 2025-08-14T21:20:14.5521034Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-08-14T21:20:14.5523132Z * [new branch] flex-flash -> origin/flex-flash 2025-08-14T21:20:14.5525396Z * [new branch] flex-lowering -> origin/flex-lowering 2025-08-14T21:20:14.5527506Z * [new branch] flex-warning -> origin/flex-warning 2025-08-14T21:20:14.5529781Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-08-14T21:20:14.5531839Z * [new branch] flex_flash -> origin/flex_flash 2025-08-14T21:20:14.5535338Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-08-14T21:20:14.5537338Z * [new branch] fmassa/try_fix_ac_tag_propagation -> origin/fmassa/try_fix_ac_tag_propagation 2025-08-14T21:20:14.5539507Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-08-14T21:20:14.5541712Z * [new branch] fsdpv2_3d -> origin/fsdpv2_3d 2025-08-14T21:20:14.5544041Z * [new branch] fsdpv2_3d_m1 -> origin/fsdpv2_3d_m1 2025-08-14T21:20:14.5546292Z * [new branch] fx_cpp -> origin/fx_cpp 2025-08-14T21:20:14.5549616Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-08-14T21:20:14.5553983Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-08-14T21:20:14.5556209Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-08-14T21:20:14.5559424Z * [new branch] gh/CaoE/2/base -> origin/gh/CaoE/2/base 2025-08-14T21:20:14.5561639Z * [new branch] gh/CaoE/2/head -> origin/gh/CaoE/2/head 2025-08-14T21:20:14.5563549Z * [new branch] gh/CaoE/2/orig -> origin/gh/CaoE/2/orig 2025-08-14T21:20:14.5575755Z * [new branch] gh/ColinPeppler/72/base -> origin/gh/ColinPeppler/72/base 2025-08-14T21:20:14.5576344Z * [new branch] gh/ColinPeppler/72/head -> origin/gh/ColinPeppler/72/head 2025-08-14T21:20:14.5577045Z * [new branch] gh/ColinPeppler/72/orig -> origin/gh/ColinPeppler/72/orig 2025-08-14T21:20:14.5577612Z * [new branch] gh/ColinPeppler/77/base -> origin/gh/ColinPeppler/77/base 2025-08-14T21:20:14.5578189Z * [new branch] gh/ColinPeppler/77/head -> origin/gh/ColinPeppler/77/head 2025-08-14T21:20:14.5578744Z * [new branch] gh/ColinPeppler/77/orig -> origin/gh/ColinPeppler/77/orig 2025-08-14T21:20:14.5580609Z * [new branch] gh/ColinPeppler/78/base -> origin/gh/ColinPeppler/78/base 2025-08-14T21:20:14.5582453Z * [new branch] gh/ColinPeppler/78/head -> origin/gh/ColinPeppler/78/head 2025-08-14T21:20:14.5584228Z * [new branch] gh/ColinPeppler/78/orig -> origin/gh/ColinPeppler/78/orig 2025-08-14T21:20:14.5587476Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-08-14T21:20:14.5589317Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-08-14T21:20:14.5591853Z * [new branch] gh/EikanWang/80/base -> origin/gh/EikanWang/80/base 2025-08-14T21:20:14.5593746Z * [new branch] gh/EikanWang/80/head -> origin/gh/EikanWang/80/head 2025-08-14T21:20:14.5595610Z * [new branch] gh/EikanWang/80/orig -> origin/gh/EikanWang/80/orig 2025-08-14T21:20:14.5598116Z * [new branch] gh/EikanWang/81/base -> origin/gh/EikanWang/81/base 2025-08-14T21:20:14.5599948Z * [new branch] gh/EikanWang/81/head -> origin/gh/EikanWang/81/head 2025-08-14T21:20:14.5601760Z * [new branch] gh/EikanWang/81/orig -> origin/gh/EikanWang/81/orig 2025-08-14T21:20:14.5605443Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-08-14T21:20:14.5607330Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-08-14T21:20:14.5610404Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-08-14T21:20:14.5612194Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-08-14T21:20:14.5613975Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-08-14T21:20:14.5616570Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-08-14T21:20:14.5618354Z * [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-08-14T21:20:14.5620113Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-08-14T21:20:14.5622666Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-08-14T21:20:14.5624501Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-08-14T21:20:14.5626307Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-08-14T21:20:14.5628782Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-08-14T21:20:14.5630567Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-08-14T21:20:14.5632344Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-08-14T21:20:14.5635122Z * [new branch] gh/H-Huang/183/base -> origin/gh/H-Huang/183/base 2025-08-14T21:20:14.5636732Z * [new branch] gh/H-Huang/183/head -> origin/gh/H-Huang/183/head 2025-08-14T21:20:14.5638489Z * [new branch] gh/H-Huang/183/orig -> origin/gh/H-Huang/183/orig 2025-08-14T21:20:14.5641058Z * [new branch] gh/H-Huang/187/base -> origin/gh/H-Huang/187/base 2025-08-14T21:20:14.5642759Z * [new branch] gh/H-Huang/187/head -> origin/gh/H-Huang/187/head 2025-08-14T21:20:14.5644563Z * [new branch] gh/H-Huang/187/orig -> origin/gh/H-Huang/187/orig 2025-08-14T21:20:14.5647648Z * [new branch] gh/H-Huang/192/base -> origin/gh/H-Huang/192/base 2025-08-14T21:20:14.5651275Z * [new branch] gh/H-Huang/192/head -> origin/gh/H-Huang/192/head 2025-08-14T21:20:14.5653026Z * [new branch] gh/H-Huang/192/orig -> origin/gh/H-Huang/192/orig 2025-08-14T21:20:14.5655544Z * [new branch] gh/H-Huang/195/base -> origin/gh/H-Huang/195/base 2025-08-14T21:20:14.5657359Z * [new branch] gh/H-Huang/195/head -> origin/gh/H-Huang/195/head 2025-08-14T21:20:14.5659156Z * [new branch] gh/H-Huang/195/orig -> origin/gh/H-Huang/195/orig 2025-08-14T21:20:14.5661865Z * [new branch] gh/H-Huang/196/base -> origin/gh/H-Huang/196/base 2025-08-14T21:20:14.5663586Z * [new branch] gh/H-Huang/196/head -> origin/gh/H-Huang/196/head 2025-08-14T21:20:14.5665346Z * [new branch] gh/H-Huang/196/orig -> origin/gh/H-Huang/196/orig 2025-08-14T21:20:14.5667987Z * [new branch] gh/H-Huang/197/base -> origin/gh/H-Huang/197/base 2025-08-14T21:20:14.5670028Z * [new branch] gh/H-Huang/197/head -> origin/gh/H-Huang/197/head 2025-08-14T21:20:14.5671912Z * [new branch] gh/H-Huang/197/orig -> origin/gh/H-Huang/197/orig 2025-08-14T21:20:14.5674515Z * [new branch] gh/H-Huang/198/base -> origin/gh/H-Huang/198/base 2025-08-14T21:20:14.5676398Z * [new branch] gh/H-Huang/198/head -> origin/gh/H-Huang/198/head 2025-08-14T21:20:14.5678145Z * [new branch] gh/H-Huang/198/orig -> origin/gh/H-Huang/198/orig 2025-08-14T21:20:14.5680491Z * [new branch] gh/H-Huang/199/base -> origin/gh/H-Huang/199/base 2025-08-14T21:20:14.5682260Z * [new branch] gh/H-Huang/199/head -> origin/gh/H-Huang/199/head 2025-08-14T21:20:14.5684048Z * [new branch] gh/H-Huang/199/orig -> origin/gh/H-Huang/199/orig 2025-08-14T21:20:14.5686800Z * [new branch] gh/H-Huang/200/base -> origin/gh/H-Huang/200/base 2025-08-14T21:20:14.5688589Z * [new branch] gh/H-Huang/200/head -> origin/gh/H-Huang/200/head 2025-08-14T21:20:14.5690506Z * [new branch] gh/H-Huang/200/orig -> origin/gh/H-Huang/200/orig 2025-08-14T21:20:14.5692816Z * [new branch] gh/H-Huang/201/base -> origin/gh/H-Huang/201/base 2025-08-14T21:20:14.5694609Z * [new branch] gh/H-Huang/201/head -> origin/gh/H-Huang/201/head 2025-08-14T21:20:14.5696401Z * [new branch] gh/H-Huang/201/orig -> origin/gh/H-Huang/201/orig 2025-08-14T21:20:14.5698948Z * [new branch] gh/H-Huang/202/base -> origin/gh/H-Huang/202/base 2025-08-14T21:20:14.5700918Z * [new branch] gh/H-Huang/202/head -> origin/gh/H-Huang/202/head 2025-08-14T21:20:14.5702601Z * [new branch] gh/H-Huang/202/orig -> origin/gh/H-Huang/202/orig 2025-08-14T21:20:14.5705389Z * [new branch] gh/H-Huang/203/base -> origin/gh/H-Huang/203/base 2025-08-14T21:20:14.5707077Z * [new branch] gh/H-Huang/203/head -> origin/gh/H-Huang/203/head 2025-08-14T21:20:14.5709033Z * [new branch] gh/H-Huang/203/orig -> origin/gh/H-Huang/203/orig 2025-08-14T21:20:14.5712002Z * [new branch] gh/H-Huang/204/base -> origin/gh/H-Huang/204/base 2025-08-14T21:20:14.5713800Z * [new branch] gh/H-Huang/204/head -> origin/gh/H-Huang/204/head 2025-08-14T21:20:14.5715567Z * [new branch] gh/H-Huang/204/orig -> origin/gh/H-Huang/204/orig 2025-08-14T21:20:14.5718067Z * [new branch] gh/H-Huang/205/base -> origin/gh/H-Huang/205/base 2025-08-14T21:20:14.5719872Z * [new branch] gh/H-Huang/205/head -> origin/gh/H-Huang/205/head 2025-08-14T21:20:14.5721662Z * [new branch] gh/H-Huang/205/orig -> origin/gh/H-Huang/205/orig 2025-08-14T21:20:14.5724163Z * [new branch] gh/H-Huang/206/base -> origin/gh/H-Huang/206/base 2025-08-14T21:20:14.5726063Z * [new branch] gh/H-Huang/206/head -> origin/gh/H-Huang/206/head 2025-08-14T21:20:14.5727883Z * [new branch] gh/H-Huang/206/orig -> origin/gh/H-Huang/206/orig 2025-08-14T21:20:14.5730512Z * [new branch] gh/H-Huang/207/base -> origin/gh/H-Huang/207/base 2025-08-14T21:20:14.5732294Z * [new branch] gh/H-Huang/207/head -> origin/gh/H-Huang/207/head 2025-08-14T21:20:14.5734148Z * [new branch] gh/H-Huang/207/orig -> origin/gh/H-Huang/207/orig 2025-08-14T21:20:14.5736695Z * [new branch] gh/H-Huang/208/base -> origin/gh/H-Huang/208/base 2025-08-14T21:20:14.5738456Z * [new branch] gh/H-Huang/208/head -> origin/gh/H-Huang/208/head 2025-08-14T21:20:14.5740267Z * [new branch] gh/H-Huang/208/orig -> origin/gh/H-Huang/208/orig 2025-08-14T21:20:14.5742859Z * [new branch] gh/H-Huang/209/base -> origin/gh/H-Huang/209/base 2025-08-14T21:20:14.5744676Z * [new branch] gh/H-Huang/209/head -> origin/gh/H-Huang/209/head 2025-08-14T21:20:14.5746479Z * [new branch] gh/H-Huang/209/orig -> origin/gh/H-Huang/209/orig 2025-08-14T21:20:14.5750003Z * [new branch] gh/IvanKobzarev/107/base -> origin/gh/IvanKobzarev/107/base 2025-08-14T21:20:14.5751983Z * [new branch] gh/IvanKobzarev/107/head -> origin/gh/IvanKobzarev/107/head 2025-08-14T21:20:14.5753850Z * [new branch] gh/IvanKobzarev/107/orig -> origin/gh/IvanKobzarev/107/orig 2025-08-14T21:20:14.5756328Z * [new branch] gh/IvanKobzarev/110/base -> origin/gh/IvanKobzarev/110/base 2025-08-14T21:20:14.5758122Z * [new branch] gh/IvanKobzarev/110/head -> origin/gh/IvanKobzarev/110/head 2025-08-14T21:20:14.5760008Z * [new branch] gh/IvanKobzarev/110/orig -> origin/gh/IvanKobzarev/110/orig 2025-08-14T21:20:14.5762565Z * [new branch] gh/IvanKobzarev/111/base -> origin/gh/IvanKobzarev/111/base 2025-08-14T21:20:14.5764998Z * [new branch] gh/IvanKobzarev/111/head -> origin/gh/IvanKobzarev/111/head 2025-08-14T21:20:14.5766823Z * [new branch] gh/IvanKobzarev/111/orig -> origin/gh/IvanKobzarev/111/orig 2025-08-14T21:20:14.5769331Z * [new branch] gh/IvanKobzarev/112/base -> origin/gh/IvanKobzarev/112/base 2025-08-14T21:20:14.5771132Z * [new branch] gh/IvanKobzarev/112/head -> origin/gh/IvanKobzarev/112/head 2025-08-14T21:20:14.5772953Z * [new branch] gh/IvanKobzarev/112/orig -> origin/gh/IvanKobzarev/112/orig 2025-08-14T21:20:14.5775552Z * [new branch] gh/IvanKobzarev/115/base -> origin/gh/IvanKobzarev/115/base 2025-08-14T21:20:14.5777409Z * [new branch] gh/IvanKobzarev/115/head -> origin/gh/IvanKobzarev/115/head 2025-08-14T21:20:14.5779222Z * [new branch] gh/IvanKobzarev/115/orig -> origin/gh/IvanKobzarev/115/orig 2025-08-14T21:20:14.5782630Z * [new branch] gh/IvanKobzarev/116/base -> origin/gh/IvanKobzarev/116/base 2025-08-14T21:20:14.5784054Z * [new branch] gh/IvanKobzarev/116/head -> origin/gh/IvanKobzarev/116/head 2025-08-14T21:20:14.5785765Z * [new branch] gh/IvanKobzarev/116/orig -> origin/gh/IvanKobzarev/116/orig 2025-08-14T21:20:14.5788323Z * [new branch] gh/IvanKobzarev/118/base -> origin/gh/IvanKobzarev/118/base 2025-08-14T21:20:14.5790089Z * [new branch] gh/IvanKobzarev/118/head -> origin/gh/IvanKobzarev/118/head 2025-08-14T21:20:14.5791867Z * [new branch] gh/IvanKobzarev/118/orig -> origin/gh/IvanKobzarev/118/orig 2025-08-14T21:20:14.5794543Z * [new branch] gh/IvanKobzarev/124/base -> origin/gh/IvanKobzarev/124/base 2025-08-14T21:20:14.5796422Z * [new branch] gh/IvanKobzarev/124/head -> origin/gh/IvanKobzarev/124/head 2025-08-14T21:20:14.5798197Z * [new branch] gh/IvanKobzarev/124/orig -> origin/gh/IvanKobzarev/124/orig 2025-08-14T21:20:14.5800905Z * [new branch] gh/IvanKobzarev/126/base -> origin/gh/IvanKobzarev/126/base 2025-08-14T21:20:14.5802766Z * [new branch] gh/IvanKobzarev/126/head -> origin/gh/IvanKobzarev/126/head 2025-08-14T21:20:14.5804616Z * [new branch] gh/IvanKobzarev/126/orig -> origin/gh/IvanKobzarev/126/orig 2025-08-14T21:20:14.5807418Z * [new branch] gh/IvanKobzarev/127/base -> origin/gh/IvanKobzarev/127/base 2025-08-14T21:20:14.5809186Z * [new branch] gh/IvanKobzarev/127/head -> origin/gh/IvanKobzarev/127/head 2025-08-14T21:20:14.5810990Z * [new branch] gh/IvanKobzarev/127/orig -> origin/gh/IvanKobzarev/127/orig 2025-08-14T21:20:14.5813523Z * [new branch] gh/IvanKobzarev/128/base -> origin/gh/IvanKobzarev/128/base 2025-08-14T21:20:14.5815387Z * [new branch] gh/IvanKobzarev/128/head -> origin/gh/IvanKobzarev/128/head 2025-08-14T21:20:14.5817139Z * [new branch] gh/IvanKobzarev/128/orig -> origin/gh/IvanKobzarev/128/orig 2025-08-14T21:20:14.5819784Z * [new branch] gh/IvanKobzarev/129/base -> origin/gh/IvanKobzarev/129/base 2025-08-14T21:20:14.5821527Z * [new branch] gh/IvanKobzarev/129/head -> origin/gh/IvanKobzarev/129/head 2025-08-14T21:20:14.5823388Z * [new branch] gh/IvanKobzarev/129/orig -> origin/gh/IvanKobzarev/129/orig 2025-08-14T21:20:14.5826348Z * [new branch] gh/IvanKobzarev/130/base -> origin/gh/IvanKobzarev/130/base 2025-08-14T21:20:14.5828257Z * [new branch] gh/IvanKobzarev/130/head -> origin/gh/IvanKobzarev/130/head 2025-08-14T21:20:14.5830114Z * [new branch] gh/IvanKobzarev/130/orig -> origin/gh/IvanKobzarev/130/orig 2025-08-14T21:20:14.5832615Z * [new branch] gh/IvanKobzarev/131/base -> origin/gh/IvanKobzarev/131/base 2025-08-14T21:20:14.5834516Z * [new branch] gh/IvanKobzarev/131/head -> origin/gh/IvanKobzarev/131/head 2025-08-14T21:20:14.5836399Z * [new branch] gh/IvanKobzarev/131/orig -> origin/gh/IvanKobzarev/131/orig 2025-08-14T21:20:14.5838956Z * [new branch] gh/IvanKobzarev/132/base -> origin/gh/IvanKobzarev/132/base 2025-08-14T21:20:14.5840850Z * [new branch] gh/IvanKobzarev/132/head -> origin/gh/IvanKobzarev/132/head 2025-08-14T21:20:14.5842697Z * [new branch] gh/IvanKobzarev/132/orig -> origin/gh/IvanKobzarev/132/orig 2025-08-14T21:20:14.5845902Z * [new branch] gh/IvanKobzarev/133/base -> origin/gh/IvanKobzarev/133/base 2025-08-14T21:20:14.5848352Z * [new branch] gh/IvanKobzarev/133/head -> origin/gh/IvanKobzarev/133/head 2025-08-14T21:20:14.5850017Z * [new branch] gh/IvanKobzarev/133/orig -> origin/gh/IvanKobzarev/133/orig 2025-08-14T21:20:14.5853442Z * [new branch] gh/IvanKobzarev/134/base -> origin/gh/IvanKobzarev/134/base 2025-08-14T21:20:14.5855216Z * [new branch] gh/IvanKobzarev/134/head -> origin/gh/IvanKobzarev/134/head 2025-08-14T21:20:14.5856585Z * [new branch] gh/IvanKobzarev/134/orig -> origin/gh/IvanKobzarev/134/orig 2025-08-14T21:20:14.5859586Z * [new branch] gh/IvanKobzarev/135/base -> origin/gh/IvanKobzarev/135/base 2025-08-14T21:20:14.5861203Z * [new branch] gh/IvanKobzarev/135/head -> origin/gh/IvanKobzarev/135/head 2025-08-14T21:20:14.5863817Z * [new branch] gh/IvanKobzarev/135/orig -> origin/gh/IvanKobzarev/135/orig 2025-08-14T21:20:14.5867560Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-08-14T21:20:14.5869325Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-08-14T21:20:14.5872467Z * [new branch] gh/NikhilAPatel/16/base -> origin/gh/NikhilAPatel/16/base 2025-08-14T21:20:14.5874239Z * [new branch] gh/NikhilAPatel/16/head -> origin/gh/NikhilAPatel/16/head 2025-08-14T21:20:14.5877133Z * [new branch] gh/NikhilAPatel/16/orig -> origin/gh/NikhilAPatel/16/orig 2025-08-14T21:20:14.5878638Z * [new branch] gh/NikhilAPatel/18/base -> origin/gh/NikhilAPatel/18/base 2025-08-14T21:20:14.5880473Z * [new branch] gh/NikhilAPatel/18/head -> origin/gh/NikhilAPatel/18/head 2025-08-14T21:20:14.5882268Z * [new branch] gh/NikhilAPatel/18/orig -> origin/gh/NikhilAPatel/18/orig 2025-08-14T21:20:14.5885234Z * [new branch] gh/NikhilAPatel/19/base -> origin/gh/NikhilAPatel/19/base 2025-08-14T21:20:14.5887353Z * [new branch] gh/NikhilAPatel/19/head -> origin/gh/NikhilAPatel/19/head 2025-08-14T21:20:14.5888349Z * [new branch] gh/NikhilAPatel/19/orig -> origin/gh/NikhilAPatel/19/orig 2025-08-14T21:20:14.5891100Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-08-14T21:20:14.5892594Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-08-14T21:20:14.5895633Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-08-14T21:20:14.5897290Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-08-14T21:20:14.5900328Z * [new branch] gh/NikhilAPatel/8/base -> origin/gh/NikhilAPatel/8/base 2025-08-14T21:20:14.5901728Z * [new branch] gh/NikhilAPatel/8/head -> origin/gh/NikhilAPatel/8/head 2025-08-14T21:20:14.5903498Z * [new branch] gh/NikhilAPatel/8/orig -> origin/gh/NikhilAPatel/8/orig 2025-08-14T21:20:14.5906467Z * [new branch] gh/NikhilAPatel/9/base -> origin/gh/NikhilAPatel/9/base 2025-08-14T21:20:14.5907870Z * [new branch] gh/NikhilAPatel/9/head -> origin/gh/NikhilAPatel/9/head 2025-08-14T21:20:14.5909663Z * [new branch] gh/NikhilAPatel/9/orig -> origin/gh/NikhilAPatel/9/orig 2025-08-14T21:20:14.5912962Z * [new branch] gh/PaliC/1/base -> origin/gh/PaliC/1/base 2025-08-14T21:20:14.5914361Z * [new branch] gh/PaliC/1/head -> origin/gh/PaliC/1/head 2025-08-14T21:20:14.5916471Z * [new branch] gh/PaliC/1/orig -> origin/gh/PaliC/1/orig 2025-08-14T21:20:14.5919254Z * [new branch] gh/PaliC/12/base -> origin/gh/PaliC/12/base 2025-08-14T21:20:14.5920933Z * [new branch] gh/PaliC/12/head -> origin/gh/PaliC/12/head 2025-08-14T21:20:14.5923244Z * [new branch] gh/PaliC/12/orig -> origin/gh/PaliC/12/orig 2025-08-14T21:20:14.5926073Z * [new branch] gh/PaliC/13/base -> origin/gh/PaliC/13/base 2025-08-14T21:20:14.5927424Z * [new branch] gh/PaliC/13/head -> origin/gh/PaliC/13/head 2025-08-14T21:20:14.5929243Z * [new branch] gh/PaliC/13/orig -> origin/gh/PaliC/13/orig 2025-08-14T21:20:14.5932179Z * [new branch] gh/PaliC/14/base -> origin/gh/PaliC/14/base 2025-08-14T21:20:14.5933538Z * [new branch] gh/PaliC/14/head -> origin/gh/PaliC/14/head 2025-08-14T21:20:14.5935333Z * [new branch] gh/PaliC/14/orig -> origin/gh/PaliC/14/orig 2025-08-14T21:20:14.5938192Z * [new branch] gh/PaliC/15/base -> origin/gh/PaliC/15/base 2025-08-14T21:20:14.5939636Z * [new branch] gh/PaliC/15/head -> origin/gh/PaliC/15/head 2025-08-14T21:20:14.5941390Z * [new branch] gh/PaliC/15/orig -> origin/gh/PaliC/15/orig 2025-08-14T21:20:14.5944201Z * [new branch] gh/PaliC/16/base -> origin/gh/PaliC/16/base 2025-08-14T21:20:14.5945648Z * [new branch] gh/PaliC/16/head -> origin/gh/PaliC/16/head 2025-08-14T21:20:14.5947741Z * [new branch] gh/PaliC/16/orig -> origin/gh/PaliC/16/orig 2025-08-14T21:20:14.5950525Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-08-14T21:20:14.5952003Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-08-14T21:20:14.5953776Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-08-14T21:20:14.5956669Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-08-14T21:20:14.5958174Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-08-14T21:20:14.5959958Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-08-14T21:20:14.5962796Z * [new branch] gh/PaliC/19/base -> origin/gh/PaliC/19/base 2025-08-14T21:20:14.5964218Z * [new branch] gh/PaliC/19/head -> origin/gh/PaliC/19/head 2025-08-14T21:20:14.5966218Z * [new branch] gh/PaliC/19/orig -> origin/gh/PaliC/19/orig 2025-08-14T21:20:14.5969017Z * [new branch] gh/PaliC/2/base -> origin/gh/PaliC/2/base 2025-08-14T21:20:14.5970447Z * [new branch] gh/PaliC/2/head -> origin/gh/PaliC/2/head 2025-08-14T21:20:14.5972452Z * [new branch] gh/PaliC/2/orig -> origin/gh/PaliC/2/orig 2025-08-14T21:20:14.5975176Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-08-14T21:20:14.5976646Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-08-14T21:20:14.5978381Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-08-14T21:20:14.5981128Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-08-14T21:20:14.5982566Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-08-14T21:20:14.5984519Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-08-14T21:20:14.5987190Z * [new branch] gh/PaliC/22/base -> origin/gh/PaliC/22/base 2025-08-14T21:20:14.5988725Z * [new branch] gh/PaliC/22/head -> origin/gh/PaliC/22/head 2025-08-14T21:20:14.5990769Z * [new branch] gh/PaliC/22/orig -> origin/gh/PaliC/22/orig 2025-08-14T21:20:14.5993240Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-08-14T21:20:14.5994796Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-08-14T21:20:14.5996609Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-08-14T21:20:14.5999349Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-08-14T21:20:14.6000910Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-08-14T21:20:14.6002849Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 2025-08-14T21:20:14.6006365Z * [new branch] gh/PaulZhang12/17/base -> origin/gh/PaulZhang12/17/base 2025-08-14T21:20:14.6007753Z * [new branch] gh/PaulZhang12/17/head -> origin/gh/PaulZhang12/17/head 2025-08-14T21:20:14.6010564Z * [new branch] gh/PaulZhang12/18/base -> origin/gh/PaulZhang12/18/base 2025-08-14T21:20:14.6012257Z * [new branch] gh/PaulZhang12/18/head -> origin/gh/PaulZhang12/18/head 2025-08-14T21:20:14.6014087Z * [new branch] gh/PaulZhang12/18/orig -> origin/gh/PaulZhang12/18/orig 2025-08-14T21:20:14.6016861Z * [new branch] gh/PaulZhang12/19/base -> origin/gh/PaulZhang12/19/base 2025-08-14T21:20:14.6018543Z * [new branch] gh/PaulZhang12/19/head -> origin/gh/PaulZhang12/19/head 2025-08-14T21:20:14.6020309Z * [new branch] gh/PaulZhang12/19/orig -> origin/gh/PaulZhang12/19/orig 2025-08-14T21:20:14.6023420Z * [new branch] gh/PaulZhang12/20/base -> origin/gh/PaulZhang12/20/base 2025-08-14T21:20:14.6024889Z * [new branch] gh/PaulZhang12/20/head -> origin/gh/PaulZhang12/20/head 2025-08-14T21:20:14.6027157Z * [new branch] gh/PaulZhang12/20/orig -> origin/gh/PaulZhang12/20/orig 2025-08-14T21:20:14.6029516Z * [new branch] gh/PaulZhang12/21/base -> origin/gh/PaulZhang12/21/base 2025-08-14T21:20:14.6031230Z * [new branch] gh/PaulZhang12/21/head -> origin/gh/PaulZhang12/21/head 2025-08-14T21:20:14.6032970Z * [new branch] gh/PaulZhang12/21/orig -> origin/gh/PaulZhang12/21/orig 2025-08-14T21:20:14.6035798Z * [new branch] gh/PaulZhang12/22/base -> origin/gh/PaulZhang12/22/base 2025-08-14T21:20:14.6037265Z * [new branch] gh/PaulZhang12/22/head -> origin/gh/PaulZhang12/22/head 2025-08-14T21:20:14.6039164Z * [new branch] gh/PaulZhang12/22/orig -> origin/gh/PaulZhang12/22/orig 2025-08-14T21:20:14.6042475Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-08-14T21:20:14.6043940Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-08-14T21:20:14.6048302Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-08-14T21:20:14.6050834Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-08-14T21:20:14.6053081Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-08-14T21:20:14.6056015Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-08-14T21:20:14.6058617Z * [new branch] gh/Sidharth123-cpu/42/base -> origin/gh/Sidharth123-cpu/42/base 2025-08-14T21:20:14.6060101Z * [new branch] gh/Sidharth123-cpu/42/head -> origin/gh/Sidharth123-cpu/42/head 2025-08-14T21:20:14.6061912Z * [new branch] gh/Sidharth123-cpu/42/orig -> origin/gh/Sidharth123-cpu/42/orig 2025-08-14T21:20:14.6064784Z * [new branch] gh/Sidharth123-cpu/43/base -> origin/gh/Sidharth123-cpu/43/base 2025-08-14T21:20:14.6066334Z * [new branch] gh/Sidharth123-cpu/43/head -> origin/gh/Sidharth123-cpu/43/head 2025-08-14T21:20:14.6068121Z * [new branch] gh/Sidharth123-cpu/43/orig -> origin/gh/Sidharth123-cpu/43/orig 2025-08-14T21:20:14.6071282Z * [new branch] gh/Sidharth123-cpu/44/base -> origin/gh/Sidharth123-cpu/44/base 2025-08-14T21:20:14.6072741Z * [new branch] gh/Sidharth123-cpu/44/head -> origin/gh/Sidharth123-cpu/44/head 2025-08-14T21:20:14.6074543Z * [new branch] gh/Sidharth123-cpu/44/orig -> origin/gh/Sidharth123-cpu/44/orig 2025-08-14T21:20:14.6077455Z * [new branch] gh/Sidharth123-cpu/45/base -> origin/gh/Sidharth123-cpu/45/base 2025-08-14T21:20:14.6078806Z * [new branch] gh/Sidharth123-cpu/45/head -> origin/gh/Sidharth123-cpu/45/head 2025-08-14T21:20:14.6080936Z * [new branch] gh/Sidharth123-cpu/45/orig -> origin/gh/Sidharth123-cpu/45/orig 2025-08-14T21:20:14.6084195Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-08-14T21:20:14.6085773Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-08-14T21:20:14.6088602Z * [new branch] gh/StrongerXi/103/base -> origin/gh/StrongerXi/103/base 2025-08-14T21:20:14.6090069Z * [new branch] gh/StrongerXi/103/head -> origin/gh/StrongerXi/103/head 2025-08-14T21:20:14.6091940Z * [new branch] gh/StrongerXi/103/orig -> origin/gh/StrongerXi/103/orig 2025-08-14T21:20:14.6095229Z * [new branch] gh/StrongerXi/133/base -> origin/gh/StrongerXi/133/base 2025-08-14T21:20:14.6096698Z * [new branch] gh/StrongerXi/133/head -> origin/gh/StrongerXi/133/head 2025-08-14T21:20:14.6098438Z * [new branch] gh/StrongerXi/133/orig -> origin/gh/StrongerXi/133/orig 2025-08-14T21:20:14.6101269Z * [new branch] gh/StrongerXi/134/base -> origin/gh/StrongerXi/134/base 2025-08-14T21:20:14.6102723Z * [new branch] gh/StrongerXi/134/head -> origin/gh/StrongerXi/134/head 2025-08-14T21:20:14.6104479Z * [new branch] gh/StrongerXi/134/orig -> origin/gh/StrongerXi/134/orig 2025-08-14T21:20:14.6107338Z * [new branch] gh/StrongerXi/135/base -> origin/gh/StrongerXi/135/base 2025-08-14T21:20:14.6108940Z * [new branch] gh/StrongerXi/135/head -> origin/gh/StrongerXi/135/head 2025-08-14T21:20:14.6110747Z * [new branch] gh/StrongerXi/135/orig -> origin/gh/StrongerXi/135/orig 2025-08-14T21:20:14.6113619Z * [new branch] gh/StrongerXi/136/base -> origin/gh/StrongerXi/136/base 2025-08-14T21:20:14.6115035Z * [new branch] gh/StrongerXi/136/head -> origin/gh/StrongerXi/136/head 2025-08-14T21:20:14.6116818Z * [new branch] gh/StrongerXi/136/orig -> origin/gh/StrongerXi/136/orig 2025-08-14T21:20:14.6119733Z * [new branch] gh/StrongerXi/137/base -> origin/gh/StrongerXi/137/base 2025-08-14T21:20:14.6121198Z * [new branch] gh/StrongerXi/137/head -> origin/gh/StrongerXi/137/head 2025-08-14T21:20:14.6122945Z * [new branch] gh/StrongerXi/137/orig -> origin/gh/StrongerXi/137/orig 2025-08-14T21:20:14.6125854Z * [new branch] gh/StrongerXi/138/base -> origin/gh/StrongerXi/138/base 2025-08-14T21:20:14.6127271Z * [new branch] gh/StrongerXi/138/head -> origin/gh/StrongerXi/138/head 2025-08-14T21:20:14.6129125Z * [new branch] gh/StrongerXi/138/orig -> origin/gh/StrongerXi/138/orig 2025-08-14T21:20:14.6131954Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-08-14T21:20:14.6133434Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-08-14T21:20:14.6136130Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-08-14T21:20:14.6137406Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-08-14T21:20:14.6141207Z * [new branch] gh/XilunWu/131/base -> origin/gh/XilunWu/131/base 2025-08-14T21:20:14.6142795Z * [new branch] gh/XilunWu/131/head -> origin/gh/XilunWu/131/head 2025-08-14T21:20:14.6144636Z * [new branch] gh/XilunWu/131/orig -> origin/gh/XilunWu/131/orig 2025-08-14T21:20:14.6147854Z * [new branch] gh/XilunWu/133/base -> origin/gh/XilunWu/133/base 2025-08-14T21:20:14.6150316Z * [new branch] gh/XilunWu/133/head -> origin/gh/XilunWu/133/head 2025-08-14T21:20:14.6151794Z * [new branch] gh/XilunWu/133/orig -> origin/gh/XilunWu/133/orig 2025-08-14T21:20:14.6154817Z * [new branch] gh/XilunWu/136/base -> origin/gh/XilunWu/136/base 2025-08-14T21:20:14.6156421Z * [new branch] gh/XilunWu/136/head -> origin/gh/XilunWu/136/head 2025-08-14T21:20:14.6158016Z * [new branch] gh/XilunWu/136/orig -> origin/gh/XilunWu/136/orig 2025-08-14T21:20:14.6161082Z * [new branch] gh/XilunWu/139/base -> origin/gh/XilunWu/139/base 2025-08-14T21:20:14.6162461Z * [new branch] gh/XilunWu/139/head -> origin/gh/XilunWu/139/head 2025-08-14T21:20:14.6164133Z * [new branch] gh/XilunWu/139/orig -> origin/gh/XilunWu/139/orig 2025-08-14T21:20:14.6167262Z * [new branch] gh/XilunWu/143/base -> origin/gh/XilunWu/143/base 2025-08-14T21:20:14.6168682Z * [new branch] gh/XilunWu/143/head -> origin/gh/XilunWu/143/head 2025-08-14T21:20:14.6170476Z * [new branch] gh/XilunWu/143/orig -> origin/gh/XilunWu/143/orig 2025-08-14T21:20:14.6173640Z * [new branch] gh/XilunWu/144/base -> origin/gh/XilunWu/144/base 2025-08-14T21:20:14.6175101Z * [new branch] gh/XilunWu/144/head -> origin/gh/XilunWu/144/head 2025-08-14T21:20:14.6176854Z * [new branch] gh/XilunWu/144/orig -> origin/gh/XilunWu/144/orig 2025-08-14T21:20:14.6179816Z * [new branch] gh/XilunWu/145/base -> origin/gh/XilunWu/145/base 2025-08-14T21:20:14.6181196Z * [new branch] gh/XilunWu/145/head -> origin/gh/XilunWu/145/head 2025-08-14T21:20:14.6182949Z * [new branch] gh/XilunWu/145/orig -> origin/gh/XilunWu/145/orig 2025-08-14T21:20:14.6185685Z * [new branch] gh/XilunWu/146/base -> origin/gh/XilunWu/146/base 2025-08-14T21:20:14.6187120Z * [new branch] gh/XilunWu/146/head -> origin/gh/XilunWu/146/head 2025-08-14T21:20:14.6188956Z * [new branch] gh/XilunWu/146/orig -> origin/gh/XilunWu/146/orig 2025-08-14T21:20:14.6191996Z * [new branch] gh/XilunWu/147/base -> origin/gh/XilunWu/147/base 2025-08-14T21:20:14.6193385Z * [new branch] gh/XilunWu/147/head -> origin/gh/XilunWu/147/head 2025-08-14T21:20:14.6195269Z * [new branch] gh/XilunWu/147/orig -> origin/gh/XilunWu/147/orig 2025-08-14T21:20:14.6197581Z * [new branch] gh/XilunWu/148/base -> origin/gh/XilunWu/148/base 2025-08-14T21:20:14.6199360Z * [new branch] gh/XilunWu/148/head -> origin/gh/XilunWu/148/head 2025-08-14T21:20:14.6201137Z * [new branch] gh/XilunWu/148/orig -> origin/gh/XilunWu/148/orig 2025-08-14T21:20:14.6203499Z * [new branch] gh/XilunWu/149/base -> origin/gh/XilunWu/149/base 2025-08-14T21:20:14.6205406Z * [new branch] gh/XilunWu/149/head -> origin/gh/XilunWu/149/head 2025-08-14T21:20:14.6207231Z * [new branch] gh/XilunWu/149/orig -> origin/gh/XilunWu/149/orig 2025-08-14T21:20:14.6209621Z * [new branch] gh/XilunWu/150/base -> origin/gh/XilunWu/150/base 2025-08-14T21:20:14.6211384Z * [new branch] gh/XilunWu/150/head -> origin/gh/XilunWu/150/head 2025-08-14T21:20:14.6213155Z * [new branch] gh/XilunWu/150/orig -> origin/gh/XilunWu/150/orig 2025-08-14T21:20:14.6215787Z * [new branch] gh/XilunWu/151/base -> origin/gh/XilunWu/151/base 2025-08-14T21:20:14.6217569Z * [new branch] gh/XilunWu/151/head -> origin/gh/XilunWu/151/head 2025-08-14T21:20:14.6228676Z * [new branch] gh/XilunWu/151/orig -> origin/gh/XilunWu/151/orig 2025-08-14T21:20:14.6229246Z * [new branch] gh/XilunWu/152/base -> origin/gh/XilunWu/152/base 2025-08-14T21:20:14.6229738Z * [new branch] gh/XilunWu/152/head -> origin/gh/XilunWu/152/head 2025-08-14T21:20:14.6230242Z * [new branch] gh/XilunWu/152/orig -> origin/gh/XilunWu/152/orig 2025-08-14T21:20:14.6230883Z * [new branch] gh/XilunWu/153/base -> origin/gh/XilunWu/153/base 2025-08-14T21:20:14.6231613Z * [new branch] gh/XilunWu/153/head -> origin/gh/XilunWu/153/head 2025-08-14T21:20:14.6232105Z * [new branch] gh/XilunWu/153/orig -> origin/gh/XilunWu/153/orig 2025-08-14T21:20:14.6234193Z * [new branch] gh/XilunWu/154/base -> origin/gh/XilunWu/154/base 2025-08-14T21:20:14.6236070Z * [new branch] gh/XilunWu/154/head -> origin/gh/XilunWu/154/head 2025-08-14T21:20:14.6238019Z * [new branch] gh/XilunWu/154/orig -> origin/gh/XilunWu/154/orig 2025-08-14T21:20:14.6241393Z * [new branch] gh/XilunWu/156/base -> origin/gh/XilunWu/156/base 2025-08-14T21:20:14.6243217Z * [new branch] gh/XilunWu/156/head -> origin/gh/XilunWu/156/head 2025-08-14T21:20:14.6245234Z * [new branch] gh/XilunWu/156/orig -> origin/gh/XilunWu/156/orig 2025-08-14T21:20:14.6248096Z * [new branch] gh/XilunWu/157/base -> origin/gh/XilunWu/157/base 2025-08-14T21:20:14.6249866Z * [new branch] gh/XilunWu/157/head -> origin/gh/XilunWu/157/head 2025-08-14T21:20:14.6251648Z * [new branch] gh/XilunWu/157/orig -> origin/gh/XilunWu/157/orig 2025-08-14T21:20:14.6254424Z * [new branch] gh/XilunWu/158/base -> origin/gh/XilunWu/158/base 2025-08-14T21:20:14.6256191Z * [new branch] gh/XilunWu/158/head -> origin/gh/XilunWu/158/head 2025-08-14T21:20:14.6257930Z * [new branch] gh/XilunWu/158/orig -> origin/gh/XilunWu/158/orig 2025-08-14T21:20:14.6260758Z * [new branch] gh/XilunWu/159/base -> origin/gh/XilunWu/159/base 2025-08-14T21:20:14.6262679Z * [new branch] gh/XilunWu/159/head -> origin/gh/XilunWu/159/head 2025-08-14T21:20:14.6264419Z * [new branch] gh/XilunWu/159/orig -> origin/gh/XilunWu/159/orig 2025-08-14T21:20:14.6267177Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-08-14T21:20:14.6269035Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-08-14T21:20:14.6270865Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-08-14T21:20:14.6273575Z * [new branch] gh/XilunWu/161/base -> origin/gh/XilunWu/161/base 2025-08-14T21:20:14.6275386Z * [new branch] gh/XilunWu/161/head -> origin/gh/XilunWu/161/head 2025-08-14T21:20:14.6277160Z * [new branch] gh/XilunWu/161/orig -> origin/gh/XilunWu/161/orig 2025-08-14T21:20:14.6279894Z * [new branch] gh/XilunWu/162/base -> origin/gh/XilunWu/162/base 2025-08-14T21:20:14.6281845Z * [new branch] gh/XilunWu/162/head -> origin/gh/XilunWu/162/head 2025-08-14T21:20:14.6283641Z * [new branch] gh/XilunWu/162/orig -> origin/gh/XilunWu/162/orig 2025-08-14T21:20:14.6286578Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-08-14T21:20:14.6288403Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-08-14T21:20:14.6290216Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-08-14T21:20:14.6293520Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-08-14T21:20:14.6295379Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-08-14T21:20:14.6297292Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-08-14T21:20:14.6299738Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-08-14T21:20:14.6301607Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-08-14T21:20:14.6303545Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-08-14T21:20:14.6334229Z * [new branch] gh/XuehaiPan/189/base -> origin/gh/XuehaiPan/189/base 2025-08-14T21:20:14.6334975Z * [new branch] gh/XuehaiPan/189/head -> origin/gh/XuehaiPan/189/head 2025-08-14T21:20:14.6335536Z * [new branch] gh/XuehaiPan/189/orig -> origin/gh/XuehaiPan/189/orig 2025-08-14T21:20:14.6336053Z * [new branch] gh/XuehaiPan/227/base -> origin/gh/XuehaiPan/227/base 2025-08-14T21:20:14.6336592Z * [new branch] gh/XuehaiPan/227/head -> origin/gh/XuehaiPan/227/head 2025-08-14T21:20:14.6337182Z * [new branch] gh/XuehaiPan/227/orig -> origin/gh/XuehaiPan/227/orig 2025-08-14T21:20:14.6337695Z * [new branch] gh/XuehaiPan/231/base -> origin/gh/XuehaiPan/231/base 2025-08-14T21:20:14.6338218Z * [new branch] gh/XuehaiPan/231/head -> origin/gh/XuehaiPan/231/head 2025-08-14T21:20:14.6338796Z * [new branch] gh/XuehaiPan/231/orig -> origin/gh/XuehaiPan/231/orig 2025-08-14T21:20:14.6339342Z * [new branch] gh/XuehaiPan/232/base -> origin/gh/XuehaiPan/232/base 2025-08-14T21:20:14.6339853Z * [new branch] gh/XuehaiPan/232/head -> origin/gh/XuehaiPan/232/head 2025-08-14T21:20:14.6340372Z * [new branch] gh/XuehaiPan/232/orig -> origin/gh/XuehaiPan/232/orig 2025-08-14T21:20:14.6340967Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-08-14T21:20:14.6341473Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-08-14T21:20:14.6341988Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-08-14T21:20:14.6342516Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-08-14T21:20:14.6343036Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-08-14T21:20:14.6343633Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-08-14T21:20:14.6344166Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-08-14T21:20:14.6344970Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-08-14T21:20:14.6347426Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-08-14T21:20:14.6349600Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-08-14T21:20:14.6351414Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-08-14T21:20:14.6353196Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-08-14T21:20:14.6355712Z * [new branch] gh/XuehaiPan/257/base -> origin/gh/XuehaiPan/257/base 2025-08-14T21:20:14.6357467Z * [new branch] gh/XuehaiPan/257/head -> origin/gh/XuehaiPan/257/head 2025-08-14T21:20:14.6359247Z * [new branch] gh/XuehaiPan/257/orig -> origin/gh/XuehaiPan/257/orig 2025-08-14T21:20:14.6361732Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-08-14T21:20:14.6363569Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-08-14T21:20:14.6365632Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-08-14T21:20:14.6368134Z * [new branch] gh/XuehaiPan/283/base -> origin/gh/XuehaiPan/283/base 2025-08-14T21:20:14.6369899Z * [new branch] gh/XuehaiPan/283/head -> origin/gh/XuehaiPan/283/head 2025-08-14T21:20:14.6371729Z * [new branch] gh/XuehaiPan/283/orig -> origin/gh/XuehaiPan/283/orig 2025-08-14T21:20:14.6374298Z * [new branch] gh/XuehaiPan/290/base -> origin/gh/XuehaiPan/290/base 2025-08-14T21:20:14.6376286Z * [new branch] gh/XuehaiPan/290/head -> origin/gh/XuehaiPan/290/head 2025-08-14T21:20:14.6377962Z * [new branch] gh/XuehaiPan/290/orig -> origin/gh/XuehaiPan/290/orig 2025-08-14T21:20:14.6380845Z * [new branch] gh/XuehaiPan/328/base -> origin/gh/XuehaiPan/328/base 2025-08-14T21:20:14.6382279Z * [new branch] gh/XuehaiPan/328/head -> origin/gh/XuehaiPan/328/head 2025-08-14T21:20:14.6384022Z * [new branch] gh/XuehaiPan/328/orig -> origin/gh/XuehaiPan/328/orig 2025-08-14T21:20:14.6386565Z * [new branch] gh/XuehaiPan/339/base -> origin/gh/XuehaiPan/339/base 2025-08-14T21:20:14.6388334Z * [new branch] gh/XuehaiPan/339/head -> origin/gh/XuehaiPan/339/head 2025-08-14T21:20:14.6390146Z * [new branch] gh/XuehaiPan/339/orig -> origin/gh/XuehaiPan/339/orig 2025-08-14T21:20:14.6392722Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-08-14T21:20:14.6394667Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-08-14T21:20:14.6396385Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-08-14T21:20:14.6399552Z * [new branch] gh/XuehaiPan/344/base -> origin/gh/XuehaiPan/344/base 2025-08-14T21:20:14.6401344Z * [new branch] gh/XuehaiPan/344/head -> origin/gh/XuehaiPan/344/head 2025-08-14T21:20:14.6403136Z * [new branch] gh/XuehaiPan/344/orig -> origin/gh/XuehaiPan/344/orig 2025-08-14T21:20:14.6405950Z * [new branch] gh/XuehaiPan/345/base -> origin/gh/XuehaiPan/345/base 2025-08-14T21:20:14.6407757Z * [new branch] gh/XuehaiPan/345/head -> origin/gh/XuehaiPan/345/head 2025-08-14T21:20:14.6409608Z * [new branch] gh/XuehaiPan/345/orig -> origin/gh/XuehaiPan/345/orig 2025-08-14T21:20:14.6412164Z * [new branch] gh/XuehaiPan/346/base -> origin/gh/XuehaiPan/346/base 2025-08-14T21:20:14.6413973Z * [new branch] gh/XuehaiPan/346/head -> origin/gh/XuehaiPan/346/head 2025-08-14T21:20:14.6416104Z * [new branch] gh/XuehaiPan/346/orig -> origin/gh/XuehaiPan/346/orig 2025-08-14T21:20:14.6418950Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-08-14T21:20:14.6420719Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-08-14T21:20:14.6422611Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-08-14T21:20:14.6425202Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-08-14T21:20:14.6426974Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-08-14T21:20:14.6428779Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-08-14T21:20:14.6431315Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-08-14T21:20:14.6433055Z * [new branch] gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-08-14T21:20:14.6434842Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-08-14T21:20:14.6437330Z * [new branch] gh/XuehaiPan/352/base -> origin/gh/XuehaiPan/352/base 2025-08-14T21:20:14.6439159Z * [new branch] gh/XuehaiPan/352/head -> origin/gh/XuehaiPan/352/head 2025-08-14T21:20:14.6440893Z * [new branch] gh/XuehaiPan/352/orig -> origin/gh/XuehaiPan/352/orig 2025-08-14T21:20:14.6443632Z * [new branch] gh/XuehaiPan/356/base -> origin/gh/XuehaiPan/356/base 2025-08-14T21:20:14.6445632Z * [new branch] gh/XuehaiPan/356/head -> origin/gh/XuehaiPan/356/head 2025-08-14T21:20:14.6448741Z * [new branch] gh/XuehaiPan/356/orig -> origin/gh/XuehaiPan/356/orig 2025-08-14T21:20:14.6450436Z * [new branch] gh/XuehaiPan/357/base -> origin/gh/XuehaiPan/357/base 2025-08-14T21:20:14.6452274Z * [new branch] gh/XuehaiPan/357/head -> origin/gh/XuehaiPan/357/head 2025-08-14T21:20:14.6453908Z * [new branch] gh/XuehaiPan/357/orig -> origin/gh/XuehaiPan/357/orig 2025-08-14T21:20:14.6456470Z * [new branch] gh/XuehaiPan/358/base -> origin/gh/XuehaiPan/358/base 2025-08-14T21:20:14.6458269Z * [new branch] gh/XuehaiPan/358/head -> origin/gh/XuehaiPan/358/head 2025-08-14T21:20:14.6460073Z * [new branch] gh/XuehaiPan/358/orig -> origin/gh/XuehaiPan/358/orig 2025-08-14T21:20:14.6462588Z * [new branch] gh/XuehaiPan/359/base -> origin/gh/XuehaiPan/359/base 2025-08-14T21:20:14.6464386Z * [new branch] gh/XuehaiPan/359/head -> origin/gh/XuehaiPan/359/head 2025-08-14T21:20:14.6466151Z * [new branch] gh/XuehaiPan/359/orig -> origin/gh/XuehaiPan/359/orig 2025-08-14T21:20:14.6468744Z * [new branch] gh/XuehaiPan/360/base -> origin/gh/XuehaiPan/360/base 2025-08-14T21:20:14.6471065Z * [new branch] gh/XuehaiPan/360/head -> origin/gh/XuehaiPan/360/head 2025-08-14T21:20:14.6472939Z * [new branch] gh/XuehaiPan/360/orig -> origin/gh/XuehaiPan/360/orig 2025-08-14T21:20:14.6475483Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-08-14T21:20:14.6477266Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-08-14T21:20:14.6478925Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-08-14T21:20:14.6481451Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-08-14T21:20:14.6483293Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-08-14T21:20:14.6486154Z * [new branch] gh/XuehaiPan/368/base -> origin/gh/XuehaiPan/368/base 2025-08-14T21:20:14.6487856Z * [new branch] gh/XuehaiPan/368/head -> origin/gh/XuehaiPan/368/head 2025-08-14T21:20:14.6489637Z * [new branch] gh/XuehaiPan/368/orig -> origin/gh/XuehaiPan/368/orig 2025-08-14T21:20:14.6492219Z * [new branch] gh/XuehaiPan/369/base -> origin/gh/XuehaiPan/369/base 2025-08-14T21:20:14.6494033Z * [new branch] gh/XuehaiPan/369/head -> origin/gh/XuehaiPan/369/head 2025-08-14T21:20:14.6495870Z * [new branch] gh/XuehaiPan/369/orig -> origin/gh/XuehaiPan/369/orig 2025-08-14T21:20:14.6498378Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-08-14T21:20:14.6500109Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-08-14T21:20:14.6501899Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-08-14T21:20:14.6504529Z * [new branch] gh/XuehaiPan/371/base -> origin/gh/XuehaiPan/371/base 2025-08-14T21:20:14.6506359Z * [new branch] gh/XuehaiPan/371/head -> origin/gh/XuehaiPan/371/head 2025-08-14T21:20:14.6508128Z * [new branch] gh/XuehaiPan/371/orig -> origin/gh/XuehaiPan/371/orig 2025-08-14T21:20:14.6510672Z * [new branch] gh/XuehaiPan/372/base -> origin/gh/XuehaiPan/372/base 2025-08-14T21:20:14.6512397Z * [new branch] gh/XuehaiPan/372/head -> origin/gh/XuehaiPan/372/head 2025-08-14T21:20:14.6514152Z * [new branch] gh/XuehaiPan/372/orig -> origin/gh/XuehaiPan/372/orig 2025-08-14T21:20:14.6516716Z * [new branch] gh/XuehaiPan/373/base -> origin/gh/XuehaiPan/373/base 2025-08-14T21:20:14.6518537Z * [new branch] gh/XuehaiPan/373/head -> origin/gh/XuehaiPan/373/head 2025-08-14T21:20:14.6520316Z * [new branch] gh/XuehaiPan/373/orig -> origin/gh/XuehaiPan/373/orig 2025-08-14T21:20:14.6523054Z * [new branch] gh/XuehaiPan/374/base -> origin/gh/XuehaiPan/374/base 2025-08-14T21:20:14.6524834Z * [new branch] gh/XuehaiPan/374/head -> origin/gh/XuehaiPan/374/head 2025-08-14T21:20:14.6526673Z * [new branch] gh/XuehaiPan/374/orig -> origin/gh/XuehaiPan/374/orig 2025-08-14T21:20:14.6529204Z * [new branch] gh/XuehaiPan/375/base -> origin/gh/XuehaiPan/375/base 2025-08-14T21:20:14.6530964Z * [new branch] gh/XuehaiPan/375/head -> origin/gh/XuehaiPan/375/head 2025-08-14T21:20:14.6532798Z * [new branch] gh/XuehaiPan/375/orig -> origin/gh/XuehaiPan/375/orig 2025-08-14T21:20:14.6535294Z * [new branch] gh/XuehaiPan/376/base -> origin/gh/XuehaiPan/376/base 2025-08-14T21:20:14.6537052Z * [new branch] gh/XuehaiPan/376/head -> origin/gh/XuehaiPan/376/head 2025-08-14T21:20:14.6538825Z * [new branch] gh/XuehaiPan/376/orig -> origin/gh/XuehaiPan/376/orig 2025-08-14T21:20:14.6541468Z * [new branch] gh/XuehaiPan/377/base -> origin/gh/XuehaiPan/377/base 2025-08-14T21:20:14.6543249Z * [new branch] gh/XuehaiPan/377/head -> origin/gh/XuehaiPan/377/head 2025-08-14T21:20:14.6545054Z * [new branch] gh/XuehaiPan/377/orig -> origin/gh/XuehaiPan/377/orig 2025-08-14T21:20:14.6547841Z * [new branch] gh/XuehaiPan/378/base -> origin/gh/XuehaiPan/378/base 2025-08-14T21:20:14.6552296Z * [new branch] gh/XuehaiPan/378/head -> origin/gh/XuehaiPan/378/head 2025-08-14T21:20:14.6554112Z * [new branch] gh/XuehaiPan/378/orig -> origin/gh/XuehaiPan/378/orig 2025-08-14T21:20:14.6556688Z * [new branch] gh/XuehaiPan/379/base -> origin/gh/XuehaiPan/379/base 2025-08-14T21:20:14.6558427Z * [new branch] gh/XuehaiPan/379/head -> origin/gh/XuehaiPan/379/head 2025-08-14T21:20:14.6560199Z * [new branch] gh/XuehaiPan/379/orig -> origin/gh/XuehaiPan/379/orig 2025-08-14T21:20:14.6563244Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-08-14T21:20:14.6565349Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-08-14T21:20:14.6567087Z * [new branch] gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-08-14T21:20:14.6569583Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-08-14T21:20:14.6571392Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-08-14T21:20:14.6573742Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-08-14T21:20:14.6575481Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-08-14T21:20:14.6577995Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-08-14T21:20:14.6579583Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-08-14T21:20:14.6582194Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-08-14T21:20:14.6583945Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-08-14T21:20:14.6586389Z * [new branch] gh/ZhiweiYan-96/64/base -> origin/gh/ZhiweiYan-96/64/base 2025-08-14T21:20:14.6588158Z * [new branch] gh/ZhiweiYan-96/64/head -> origin/gh/ZhiweiYan-96/64/head 2025-08-14T21:20:14.6590224Z * [new branch] gh/ZhiweiYan-96/64/orig -> origin/gh/ZhiweiYan-96/64/orig 2025-08-14T21:20:14.6592562Z * [new branch] gh/ZhiweiYan-96/65/base -> origin/gh/ZhiweiYan-96/65/base 2025-08-14T21:20:14.6594280Z * [new branch] gh/ZhiweiYan-96/65/head -> origin/gh/ZhiweiYan-96/65/head 2025-08-14T21:20:14.6596206Z * [new branch] gh/ZhiweiYan-96/65/orig -> origin/gh/ZhiweiYan-96/65/orig 2025-08-14T21:20:14.6598546Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-08-14T21:20:14.6600294Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-08-14T21:20:14.6602677Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-08-14T21:20:14.6604524Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-08-14T21:20:14.6606981Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-08-14T21:20:14.6608892Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-08-14T21:20:14.6610471Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-08-14T21:20:14.6613701Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-08-14T21:20:14.6616082Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-08-14T21:20:14.6618508Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-08-14T21:20:14.6620377Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-08-14T21:20:14.6622834Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-08-14T21:20:14.6624633Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-08-14T21:20:14.6626188Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-08-14T21:20:14.6628988Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-08-14T21:20:14.6630463Z * [new branch] gh/alexbrauckmann/paddedtensor_init -> origin/gh/alexbrauckmann/paddedtensor_init 2025-08-14T21:20:14.6632425Z * [new branch] gh/alexbrauckmann/paddedtensor_meta_init -> origin/gh/alexbrauckmann/paddedtensor_meta_init 2025-08-14T21:20:14.6635531Z * [new branch] gh/alexsamardzic/7/base -> origin/gh/alexsamardzic/7/base 2025-08-14T21:20:14.6637262Z * [new branch] gh/alexsamardzic/7/head -> origin/gh/alexsamardzic/7/head 2025-08-14T21:20:14.6639063Z * [new branch] gh/alexsamardzic/7/orig -> origin/gh/alexsamardzic/7/orig 2025-08-14T21:20:14.6641502Z * [new branch] gh/alexsamardzic/8/base -> origin/gh/alexsamardzic/8/base 2025-08-14T21:20:14.6643251Z * [new branch] gh/alexsamardzic/8/head -> origin/gh/alexsamardzic/8/head 2025-08-14T21:20:14.6645226Z * [new branch] gh/alexsamardzic/8/orig -> origin/gh/alexsamardzic/8/orig 2025-08-14T21:20:14.6648547Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-08-14T21:20:14.6650315Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-08-14T21:20:14.6652051Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-08-14T21:20:14.6655030Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-08-14T21:20:14.6657109Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-08-14T21:20:14.6658938Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-08-14T21:20:14.6661698Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-08-14T21:20:14.6663584Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-08-14T21:20:14.6665465Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-08-14T21:20:14.6668669Z * [new branch] gh/andyanwang/1/base -> origin/gh/andyanwang/1/base 2025-08-14T21:20:14.6670561Z * [new branch] gh/andyanwang/1/head -> origin/gh/andyanwang/1/head 2025-08-14T21:20:14.6672161Z * [new branch] gh/andyanwang/1/orig -> origin/gh/andyanwang/1/orig 2025-08-14T21:20:14.6674791Z * [new branch] gh/andyanwang/13/base -> origin/gh/andyanwang/13/base 2025-08-14T21:20:14.6676640Z * [new branch] gh/andyanwang/13/head -> origin/gh/andyanwang/13/head 2025-08-14T21:20:14.6678924Z * [new branch] gh/andyanwang/13/orig -> origin/gh/andyanwang/13/orig 2025-08-14T21:20:14.6680911Z * [new branch] gh/andyanwang/2/base -> origin/gh/andyanwang/2/base 2025-08-14T21:20:14.6682679Z * [new branch] gh/andyanwang/2/head -> origin/gh/andyanwang/2/head 2025-08-14T21:20:14.6684810Z * [new branch] gh/andyanwang/2/orig -> origin/gh/andyanwang/2/orig 2025-08-14T21:20:14.6687399Z * [new branch] gh/andyanwang/28/base -> origin/gh/andyanwang/28/base 2025-08-14T21:20:14.6689167Z * [new branch] gh/andyanwang/28/head -> origin/gh/andyanwang/28/head 2025-08-14T21:20:14.6690937Z * [new branch] gh/andyanwang/28/orig -> origin/gh/andyanwang/28/orig 2025-08-14T21:20:14.6693293Z * [new branch] gh/andyanwang/3/base -> origin/gh/andyanwang/3/base 2025-08-14T21:20:14.6695178Z * [new branch] gh/andyanwang/3/head -> origin/gh/andyanwang/3/head 2025-08-14T21:20:14.6696965Z * [new branch] gh/andyanwang/3/orig -> origin/gh/andyanwang/3/orig 2025-08-14T21:20:14.6699442Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-08-14T21:20:14.6701418Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-08-14T21:20:14.6703967Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-08-14T21:20:14.6705970Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-08-14T21:20:14.6709161Z * [new branch] gh/andyanwang/32/base -> origin/gh/andyanwang/32/base 2025-08-14T21:20:14.6710959Z * [new branch] gh/andyanwang/32/head -> origin/gh/andyanwang/32/head 2025-08-14T21:20:14.6712887Z * [new branch] gh/andyanwang/32/orig -> origin/gh/andyanwang/32/orig 2025-08-14T21:20:14.6715512Z * [new branch] gh/andyanwang/33/base -> origin/gh/andyanwang/33/base 2025-08-14T21:20:14.6717360Z * [new branch] gh/andyanwang/33/head -> origin/gh/andyanwang/33/head 2025-08-14T21:20:14.6719131Z * [new branch] gh/andyanwang/33/orig -> origin/gh/andyanwang/33/orig 2025-08-14T21:20:14.6721596Z * [new branch] gh/andyanwang/34/base -> origin/gh/andyanwang/34/base 2025-08-14T21:20:14.6723456Z * [new branch] gh/andyanwang/34/head -> origin/gh/andyanwang/34/head 2025-08-14T21:20:14.6725511Z * [new branch] gh/andyanwang/34/orig -> origin/gh/andyanwang/34/orig 2025-08-14T21:20:14.6728133Z * [new branch] gh/andyanwang/35/base -> origin/gh/andyanwang/35/base 2025-08-14T21:20:14.6729923Z * [new branch] gh/andyanwang/35/head -> origin/gh/andyanwang/35/head 2025-08-14T21:20:14.6731801Z * [new branch] gh/andyanwang/35/orig -> origin/gh/andyanwang/35/orig 2025-08-14T21:20:14.6734693Z * [new branch] gh/andyanwang/36/base -> origin/gh/andyanwang/36/base 2025-08-14T21:20:14.6736811Z * [new branch] gh/andyanwang/36/head -> origin/gh/andyanwang/36/head 2025-08-14T21:20:14.6738681Z * [new branch] gh/andyanwang/36/orig -> origin/gh/andyanwang/36/orig 2025-08-14T21:20:14.6741349Z * [new branch] gh/andyanwang/37/base -> origin/gh/andyanwang/37/base 2025-08-14T21:20:14.6743488Z * [new branch] gh/andyanwang/37/head -> origin/gh/andyanwang/37/head 2025-08-14T21:20:14.6745441Z * [new branch] gh/andyanwang/37/orig -> origin/gh/andyanwang/37/orig 2025-08-14T21:20:14.6748100Z * [new branch] gh/andyanwang/38/base -> origin/gh/andyanwang/38/base 2025-08-14T21:20:14.6749845Z * [new branch] gh/andyanwang/38/head -> origin/gh/andyanwang/38/head 2025-08-14T21:20:14.6751751Z * [new branch] gh/andyanwang/38/orig -> origin/gh/andyanwang/38/orig 2025-08-14T21:20:14.6754244Z * [new branch] gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-08-14T21:20:14.6756124Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-08-14T21:20:14.6757948Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-08-14T21:20:14.6760599Z * [new branch] gh/andyanwang/4/base -> origin/gh/andyanwang/4/base 2025-08-14T21:20:14.6762319Z * [new branch] gh/andyanwang/4/head -> origin/gh/andyanwang/4/head 2025-08-14T21:20:14.6764160Z * [new branch] gh/andyanwang/4/orig -> origin/gh/andyanwang/4/orig 2025-08-14T21:20:14.6766894Z * [new branch] gh/andyanwang/40/base -> origin/gh/andyanwang/40/base 2025-08-14T21:20:14.6768674Z * [new branch] gh/andyanwang/40/head -> origin/gh/andyanwang/40/head 2025-08-14T21:20:14.6770490Z * [new branch] gh/andyanwang/40/orig -> origin/gh/andyanwang/40/orig 2025-08-14T21:20:14.6773495Z * [new branch] gh/angelayi/102/base -> origin/gh/angelayi/102/base 2025-08-14T21:20:14.6775281Z * [new branch] gh/angelayi/102/head -> origin/gh/angelayi/102/head 2025-08-14T21:20:14.6777095Z * [new branch] gh/angelayi/102/orig -> origin/gh/angelayi/102/orig 2025-08-14T21:20:14.6779456Z * [new branch] gh/angelayi/103/base -> origin/gh/angelayi/103/base 2025-08-14T21:20:14.6781764Z * [new branch] gh/angelayi/103/head -> origin/gh/angelayi/103/head 2025-08-14T21:20:14.6783592Z * [new branch] gh/angelayi/103/orig -> origin/gh/angelayi/103/orig 2025-08-14T21:20:14.6786167Z * [new branch] gh/angelayi/104/base -> origin/gh/angelayi/104/base 2025-08-14T21:20:14.6787996Z * [new branch] gh/angelayi/104/head -> origin/gh/angelayi/104/head 2025-08-14T21:20:14.6789789Z * [new branch] gh/angelayi/104/orig -> origin/gh/angelayi/104/orig 2025-08-14T21:20:14.6792212Z * [new branch] gh/angelayi/105/base -> origin/gh/angelayi/105/base 2025-08-14T21:20:14.6794009Z * [new branch] gh/angelayi/105/head -> origin/gh/angelayi/105/head 2025-08-14T21:20:14.6795810Z * [new branch] gh/angelayi/105/orig -> origin/gh/angelayi/105/orig 2025-08-14T21:20:14.6798211Z * [new branch] gh/angelayi/106/base -> origin/gh/angelayi/106/base 2025-08-14T21:20:14.6799985Z * [new branch] gh/angelayi/106/head -> origin/gh/angelayi/106/head 2025-08-14T21:20:14.6801743Z * [new branch] gh/angelayi/106/orig -> origin/gh/angelayi/106/orig 2025-08-14T21:20:14.6804288Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-08-14T21:20:14.6806303Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-08-14T21:20:14.6808831Z * [new branch] gh/angelayi/108/base -> origin/gh/angelayi/108/base 2025-08-14T21:20:14.6810587Z * [new branch] gh/angelayi/108/head -> origin/gh/angelayi/108/head 2025-08-14T21:20:14.6812378Z * [new branch] gh/angelayi/108/orig -> origin/gh/angelayi/108/orig 2025-08-14T21:20:14.6814853Z * [new branch] gh/angelayi/109/base -> origin/gh/angelayi/109/base 2025-08-14T21:20:14.6816698Z * [new branch] gh/angelayi/109/head -> origin/gh/angelayi/109/head 2025-08-14T21:20:14.6818610Z * [new branch] gh/angelayi/109/orig -> origin/gh/angelayi/109/orig 2025-08-14T21:20:14.6821021Z * [new branch] gh/angelayi/110/base -> origin/gh/angelayi/110/base 2025-08-14T21:20:14.6822801Z * [new branch] gh/angelayi/110/head -> origin/gh/angelayi/110/head 2025-08-14T21:20:14.6824650Z * [new branch] gh/angelayi/110/orig -> origin/gh/angelayi/110/orig 2025-08-14T21:20:14.6827156Z * [new branch] gh/angelayi/97/base -> origin/gh/angelayi/97/base 2025-08-14T21:20:14.6828959Z * [new branch] gh/angelayi/97/head -> origin/gh/angelayi/97/head 2025-08-14T21:20:14.6830747Z * [new branch] gh/angelayi/97/orig -> origin/gh/angelayi/97/orig 2025-08-14T21:20:14.6833809Z * [new branch] gh/ani300/1/base -> origin/gh/ani300/1/base 2025-08-14T21:20:14.6835623Z * [new branch] gh/ani300/1/head -> origin/gh/ani300/1/head 2025-08-14T21:20:14.6837392Z * [new branch] gh/ani300/1/orig -> origin/gh/ani300/1/orig 2025-08-14T21:20:14.6840516Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-08-14T21:20:14.6842320Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-08-14T21:20:14.6844112Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-08-14T21:20:14.6846814Z * [new branch] gh/anijain2305/766/base -> origin/gh/anijain2305/766/base 2025-08-14T21:20:14.6848846Z * [new branch] gh/anijain2305/766/head -> origin/gh/anijain2305/766/head 2025-08-14T21:20:14.6850873Z * [new branch] gh/anijain2305/766/orig -> origin/gh/anijain2305/766/orig 2025-08-14T21:20:14.6853450Z * [new branch] gh/anijain2305/790/base -> origin/gh/anijain2305/790/base 2025-08-14T21:20:14.6855223Z * [new branch] gh/anijain2305/790/head -> origin/gh/anijain2305/790/head 2025-08-14T21:20:14.6857388Z * [new branch] gh/anijain2305/790/orig -> origin/gh/anijain2305/790/orig 2025-08-14T21:20:14.6859663Z * [new branch] gh/anijain2305/792/base -> origin/gh/anijain2305/792/base 2025-08-14T21:20:14.6861784Z * [new branch] gh/anijain2305/792/head -> origin/gh/anijain2305/792/head 2025-08-14T21:20:14.6862932Z * [new branch] gh/anijain2305/792/orig -> origin/gh/anijain2305/792/orig 2025-08-14T21:20:14.6865836Z * [new branch] gh/anijain2305/803/base -> origin/gh/anijain2305/803/base 2025-08-14T21:20:14.6867766Z * [new branch] gh/anijain2305/803/head -> origin/gh/anijain2305/803/head 2025-08-14T21:20:14.6869314Z * [new branch] gh/anijain2305/803/orig -> origin/gh/anijain2305/803/orig 2025-08-14T21:20:14.6871799Z * [new branch] gh/anijain2305/804/base -> origin/gh/anijain2305/804/base 2025-08-14T21:20:14.6873328Z * [new branch] gh/anijain2305/804/head -> origin/gh/anijain2305/804/head 2025-08-14T21:20:14.6875370Z * [new branch] gh/anijain2305/804/orig -> origin/gh/anijain2305/804/orig 2025-08-14T21:20:14.6877829Z * [new branch] gh/anijain2305/805/base -> origin/gh/anijain2305/805/base 2025-08-14T21:20:14.6879581Z * [new branch] gh/anijain2305/805/head -> origin/gh/anijain2305/805/head 2025-08-14T21:20:14.6881580Z * [new branch] gh/anijain2305/805/orig -> origin/gh/anijain2305/805/orig 2025-08-14T21:20:14.6884148Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-08-14T21:20:14.6886124Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-08-14T21:20:14.6887812Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-08-14T21:20:14.6890505Z * [new branch] gh/anijain2305/811/base -> origin/gh/anijain2305/811/base 2025-08-14T21:20:14.6892086Z * [new branch] gh/anijain2305/811/head -> origin/gh/anijain2305/811/head 2025-08-14T21:20:14.6893889Z * [new branch] gh/anijain2305/811/orig -> origin/gh/anijain2305/811/orig 2025-08-14T21:20:14.6896407Z * [new branch] gh/anijain2305/812/base -> origin/gh/anijain2305/812/base 2025-08-14T21:20:14.6898250Z * [new branch] gh/anijain2305/812/head -> origin/gh/anijain2305/812/head 2025-08-14T21:20:14.6900042Z * [new branch] gh/anijain2305/812/orig -> origin/gh/anijain2305/812/orig 2025-08-14T21:20:14.6902529Z * [new branch] gh/anijain2305/813/base -> origin/gh/anijain2305/813/base 2025-08-14T21:20:14.6904368Z * [new branch] gh/anijain2305/813/head -> origin/gh/anijain2305/813/head 2025-08-14T21:20:14.6906267Z * [new branch] gh/anijain2305/813/orig -> origin/gh/anijain2305/813/orig 2025-08-14T21:20:14.6908798Z * [new branch] gh/anijain2305/814/base -> origin/gh/anijain2305/814/base 2025-08-14T21:20:14.6910463Z * [new branch] gh/anijain2305/814/head -> origin/gh/anijain2305/814/head 2025-08-14T21:20:14.6912315Z * [new branch] gh/anijain2305/814/orig -> origin/gh/anijain2305/814/orig 2025-08-14T21:20:14.6914810Z * [new branch] gh/anijain2305/815/base -> origin/gh/anijain2305/815/base 2025-08-14T21:20:14.6916560Z * [new branch] gh/anijain2305/815/head -> origin/gh/anijain2305/815/head 2025-08-14T21:20:14.6918341Z * [new branch] gh/anijain2305/815/orig -> origin/gh/anijain2305/815/orig 2025-08-14T21:20:14.6920849Z * [new branch] gh/anijain2305/816/base -> origin/gh/anijain2305/816/base 2025-08-14T21:20:14.6922602Z * [new branch] gh/anijain2305/816/head -> origin/gh/anijain2305/816/head 2025-08-14T21:20:14.6925262Z * [new branch] gh/anijain2305/817/base -> origin/gh/anijain2305/817/base 2025-08-14T21:20:14.6927045Z * [new branch] gh/anijain2305/817/head -> origin/gh/anijain2305/817/head 2025-08-14T21:20:14.6928897Z * [new branch] gh/anijain2305/817/orig -> origin/gh/anijain2305/817/orig 2025-08-14T21:20:14.6931454Z * [new branch] gh/anijain2305/818/base -> origin/gh/anijain2305/818/base 2025-08-14T21:20:14.6933349Z * [new branch] gh/anijain2305/818/head -> origin/gh/anijain2305/818/head 2025-08-14T21:20:14.6935078Z * [new branch] gh/anijain2305/818/orig -> origin/gh/anijain2305/818/orig 2025-08-14T21:20:14.6938115Z * [new branch] gh/anijain2305/819/base -> origin/gh/anijain2305/819/base 2025-08-14T21:20:14.6940025Z * [new branch] gh/anijain2305/819/head -> origin/gh/anijain2305/819/head 2025-08-14T21:20:14.6941808Z * [new branch] gh/anijain2305/819/orig -> origin/gh/anijain2305/819/orig 2025-08-14T21:20:14.6944376Z * [new branch] gh/anijain2305/820/base -> origin/gh/anijain2305/820/base 2025-08-14T21:20:14.6946336Z * [new branch] gh/anijain2305/820/head -> origin/gh/anijain2305/820/head 2025-08-14T21:20:14.6948370Z * [new branch] gh/anijain2305/820/orig -> origin/gh/anijain2305/820/orig 2025-08-14T21:20:14.6950831Z * [new branch] gh/anijain2305/821/base -> origin/gh/anijain2305/821/base 2025-08-14T21:20:14.6952634Z * [new branch] gh/anijain2305/821/head -> origin/gh/anijain2305/821/head 2025-08-14T21:20:14.6954422Z * [new branch] gh/anijain2305/821/orig -> origin/gh/anijain2305/821/orig 2025-08-14T21:20:14.6957036Z * [new branch] gh/anijain2305/822/base -> origin/gh/anijain2305/822/base 2025-08-14T21:20:14.6958927Z * [new branch] gh/anijain2305/822/head -> origin/gh/anijain2305/822/head 2025-08-14T21:20:14.6960866Z * [new branch] gh/anijain2305/822/orig -> origin/gh/anijain2305/822/orig 2025-08-14T21:20:14.6963152Z * [new branch] gh/anijain2305/823/base -> origin/gh/anijain2305/823/base 2025-08-14T21:20:14.6965147Z * [new branch] gh/anijain2305/823/head -> origin/gh/anijain2305/823/head 2025-08-14T21:20:14.6966869Z * [new branch] gh/anijain2305/823/orig -> origin/gh/anijain2305/823/orig 2025-08-14T21:20:14.6969526Z * [new branch] gh/anijain2305/824/base -> origin/gh/anijain2305/824/base 2025-08-14T21:20:14.6971498Z * [new branch] gh/anijain2305/824/head -> origin/gh/anijain2305/824/head 2025-08-14T21:20:14.6973234Z * [new branch] gh/anijain2305/824/orig -> origin/gh/anijain2305/824/orig 2025-08-14T21:20:14.6976080Z * [new branch] gh/anijain2305/825/base -> origin/gh/anijain2305/825/base 2025-08-14T21:20:14.6977803Z * [new branch] gh/anijain2305/825/head -> origin/gh/anijain2305/825/head 2025-08-14T21:20:14.6979556Z * [new branch] gh/anijain2305/825/orig -> origin/gh/anijain2305/825/orig 2025-08-14T21:20:14.6982202Z * [new branch] gh/anijain2305/826/base -> origin/gh/anijain2305/826/base 2025-08-14T21:20:14.6984005Z * [new branch] gh/anijain2305/826/head -> origin/gh/anijain2305/826/head 2025-08-14T21:20:14.6985764Z * [new branch] gh/anijain2305/826/orig -> origin/gh/anijain2305/826/orig 2025-08-14T21:20:14.6988197Z * [new branch] gh/anijain2305/827/base -> origin/gh/anijain2305/827/base 2025-08-14T21:20:14.6989985Z * [new branch] gh/anijain2305/827/head -> origin/gh/anijain2305/827/head 2025-08-14T21:20:14.6991751Z * [new branch] gh/anijain2305/827/orig -> origin/gh/anijain2305/827/orig 2025-08-14T21:20:14.6994338Z * [new branch] gh/anijain2305/828/base -> origin/gh/anijain2305/828/base 2025-08-14T21:20:14.6996129Z * [new branch] gh/anijain2305/828/head -> origin/gh/anijain2305/828/head 2025-08-14T21:20:14.6997907Z * [new branch] gh/anijain2305/828/orig -> origin/gh/anijain2305/828/orig 2025-08-14T21:20:14.7000511Z * [new branch] gh/anijain2305/829/base -> origin/gh/anijain2305/829/base 2025-08-14T21:20:14.7002273Z * [new branch] gh/anijain2305/829/head -> origin/gh/anijain2305/829/head 2025-08-14T21:20:14.7004060Z * [new branch] gh/anijain2305/829/orig -> origin/gh/anijain2305/829/orig 2025-08-14T21:20:14.7006857Z * [new branch] gh/anijain2305/830/base -> origin/gh/anijain2305/830/base 2025-08-14T21:20:14.7008625Z * [new branch] gh/anijain2305/830/head -> origin/gh/anijain2305/830/head 2025-08-14T21:20:14.7010522Z * [new branch] gh/anijain2305/830/orig -> origin/gh/anijain2305/830/orig 2025-08-14T21:20:14.7013022Z * [new branch] gh/anijain2305/831/base -> origin/gh/anijain2305/831/base 2025-08-14T21:20:14.7014805Z * [new branch] gh/anijain2305/831/head -> origin/gh/anijain2305/831/head 2025-08-14T21:20:14.7016672Z * [new branch] gh/anijain2305/831/orig -> origin/gh/anijain2305/831/orig 2025-08-14T21:20:14.7019207Z * [new branch] gh/anijain2305/832/base -> origin/gh/anijain2305/832/base 2025-08-14T21:20:14.7021000Z * [new branch] gh/anijain2305/832/head -> origin/gh/anijain2305/832/head 2025-08-14T21:20:14.7022751Z * [new branch] gh/anijain2305/832/orig -> origin/gh/anijain2305/832/orig 2025-08-14T21:20:14.7025312Z * [new branch] gh/anijain2305/833/base -> origin/gh/anijain2305/833/base 2025-08-14T21:20:14.7027106Z * [new branch] gh/anijain2305/833/head -> origin/gh/anijain2305/833/head 2025-08-14T21:20:14.7028864Z * [new branch] gh/anijain2305/833/orig -> origin/gh/anijain2305/833/orig 2025-08-14T21:20:14.7031627Z * [new branch] gh/anijain2305/834/base -> origin/gh/anijain2305/834/base 2025-08-14T21:20:14.7033335Z * [new branch] gh/anijain2305/834/head -> origin/gh/anijain2305/834/head 2025-08-14T21:20:14.7035103Z * [new branch] gh/anijain2305/834/orig -> origin/gh/anijain2305/834/orig 2025-08-14T21:20:14.7037636Z * [new branch] gh/anijain2305/835/base -> origin/gh/anijain2305/835/base 2025-08-14T21:20:14.7039440Z * [new branch] gh/anijain2305/835/head -> origin/gh/anijain2305/835/head 2025-08-14T21:20:14.7041287Z * [new branch] gh/anijain2305/835/orig -> origin/gh/anijain2305/835/orig 2025-08-14T21:20:14.7043812Z * [new branch] gh/anijain2305/836/base -> origin/gh/anijain2305/836/base 2025-08-14T21:20:14.7045822Z * [new branch] gh/anijain2305/836/head -> origin/gh/anijain2305/836/head 2025-08-14T21:20:14.7047655Z * [new branch] gh/anijain2305/836/orig -> origin/gh/anijain2305/836/orig 2025-08-14T21:20:14.7053513Z * [new branch] gh/anijain2305/837/base -> origin/gh/anijain2305/837/base 2025-08-14T21:20:14.7055280Z * [new branch] gh/anijain2305/837/head -> origin/gh/anijain2305/837/head 2025-08-14T21:20:14.7057025Z * [new branch] gh/anijain2305/837/orig -> origin/gh/anijain2305/837/orig 2025-08-14T21:20:14.7059613Z * [new branch] gh/anijain2305/838/base -> origin/gh/anijain2305/838/base 2025-08-14T21:20:14.7061372Z * [new branch] gh/anijain2305/838/head -> origin/gh/anijain2305/838/head 2025-08-14T21:20:14.7063148Z * [new branch] gh/anijain2305/838/orig -> origin/gh/anijain2305/838/orig 2025-08-14T21:20:14.7065993Z * [new branch] gh/anijain2305/839/base -> origin/gh/anijain2305/839/base 2025-08-14T21:20:14.7067764Z * [new branch] gh/anijain2305/839/head -> origin/gh/anijain2305/839/head 2025-08-14T21:20:14.7076142Z * [new branch] gh/anijain2305/839/orig -> origin/gh/anijain2305/839/orig 2025-08-14T21:20:14.7076733Z * [new branch] gh/anijain2305/840/base -> origin/gh/anijain2305/840/base 2025-08-14T21:20:14.7077259Z * [new branch] gh/anijain2305/840/head -> origin/gh/anijain2305/840/head 2025-08-14T21:20:14.7077799Z * [new branch] gh/anijain2305/840/orig -> origin/gh/anijain2305/840/orig 2025-08-14T21:20:14.7078587Z * [new branch] gh/anijain2305/841/base -> origin/gh/anijain2305/841/base 2025-08-14T21:20:14.7080193Z * [new branch] gh/anijain2305/841/head -> origin/gh/anijain2305/841/head 2025-08-14T21:20:14.7081932Z * [new branch] gh/anijain2305/841/orig -> origin/gh/anijain2305/841/orig 2025-08-14T21:20:14.7084514Z * [new branch] gh/anijain2305/842/base -> origin/gh/anijain2305/842/base 2025-08-14T21:20:14.7086482Z * [new branch] gh/anijain2305/842/head -> origin/gh/anijain2305/842/head 2025-08-14T21:20:14.7088312Z * [new branch] gh/anijain2305/842/orig -> origin/gh/anijain2305/842/orig 2025-08-14T21:20:14.7090824Z * [new branch] gh/anijain2305/843/base -> origin/gh/anijain2305/843/base 2025-08-14T21:20:14.7092547Z * [new branch] gh/anijain2305/843/head -> origin/gh/anijain2305/843/head 2025-08-14T21:20:14.7094349Z * [new branch] gh/anijain2305/843/orig -> origin/gh/anijain2305/843/orig 2025-08-14T21:20:14.7096906Z * [new branch] gh/anijain2305/844/base -> origin/gh/anijain2305/844/base 2025-08-14T21:20:14.7098691Z * [new branch] gh/anijain2305/844/head -> origin/gh/anijain2305/844/head 2025-08-14T21:20:14.7100421Z * [new branch] gh/anijain2305/844/orig -> origin/gh/anijain2305/844/orig 2025-08-14T21:20:14.7103067Z * [new branch] gh/anijain2305/845/base -> origin/gh/anijain2305/845/base 2025-08-14T21:20:14.7108902Z * [new branch] gh/anijain2305/845/head -> origin/gh/anijain2305/845/head 2025-08-14T21:20:14.7109715Z * [new branch] gh/anijain2305/845/orig -> origin/gh/anijain2305/845/orig 2025-08-14T21:20:14.7110265Z * [new branch] gh/anijain2305/846/base -> origin/gh/anijain2305/846/base 2025-08-14T21:20:14.7110926Z * [new branch] gh/anijain2305/846/head -> origin/gh/anijain2305/846/head 2025-08-14T21:20:14.7112610Z * [new branch] gh/anijain2305/846/orig -> origin/gh/anijain2305/846/orig 2025-08-14T21:20:14.7115251Z * [new branch] gh/anijain2305/847/base -> origin/gh/anijain2305/847/base 2025-08-14T21:20:14.7117028Z * [new branch] gh/anijain2305/847/head -> origin/gh/anijain2305/847/head 2025-08-14T21:20:14.7118826Z * [new branch] gh/anijain2305/847/orig -> origin/gh/anijain2305/847/orig 2025-08-14T21:20:14.7121477Z * [new branch] gh/anijain2305/848/base -> origin/gh/anijain2305/848/base 2025-08-14T21:20:14.7123291Z * [new branch] gh/anijain2305/848/head -> origin/gh/anijain2305/848/head 2025-08-14T21:20:14.7125195Z * [new branch] gh/anijain2305/848/orig -> origin/gh/anijain2305/848/orig 2025-08-14T21:20:14.7128398Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-08-14T21:20:14.7130178Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-08-14T21:20:14.7132078Z * [new branch] gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 2025-08-14T21:20:14.7135255Z * [new branch] gh/ankitageorge/10/base -> origin/gh/ankitageorge/10/base 2025-08-14T21:20:14.7136980Z * [new branch] gh/ankitageorge/10/head -> origin/gh/ankitageorge/10/head 2025-08-14T21:20:14.7139004Z * [new branch] gh/ankitageorge/10/orig -> origin/gh/ankitageorge/10/orig 2025-08-14T21:20:14.7141373Z * [new branch] gh/ankitageorge/12/base -> origin/gh/ankitageorge/12/base 2025-08-14T21:20:14.7143244Z * [new branch] gh/ankitageorge/12/head -> origin/gh/ankitageorge/12/head 2025-08-14T21:20:14.7145205Z * [new branch] gh/ankitageorge/12/orig -> origin/gh/ankitageorge/12/orig 2025-08-14T21:20:14.7147904Z * [new branch] gh/ankitageorge/13/base -> origin/gh/ankitageorge/13/base 2025-08-14T21:20:14.7149669Z * [new branch] gh/ankitageorge/13/head -> origin/gh/ankitageorge/13/head 2025-08-14T21:20:14.7151468Z * [new branch] gh/ankitageorge/13/orig -> origin/gh/ankitageorge/13/orig 2025-08-14T21:20:14.7154144Z * [new branch] gh/ankitageorge/14/base -> origin/gh/ankitageorge/14/base 2025-08-14T21:20:14.7156647Z * [new branch] gh/ankitageorge/14/head -> origin/gh/ankitageorge/14/head 2025-08-14T21:20:14.7158658Z * [new branch] gh/ankitageorge/14/orig -> origin/gh/ankitageorge/14/orig 2025-08-14T21:20:14.7161221Z * [new branch] gh/ankitageorge/15/base -> origin/gh/ankitageorge/15/base 2025-08-14T21:20:14.7163033Z * [new branch] gh/ankitageorge/15/head -> origin/gh/ankitageorge/15/head 2025-08-14T21:20:14.7165080Z * [new branch] gh/ankitageorge/15/orig -> origin/gh/ankitageorge/15/orig 2025-08-14T21:20:14.7167747Z * [new branch] gh/ankitageorge/16/base -> origin/gh/ankitageorge/16/base 2025-08-14T21:20:14.7169601Z * [new branch] gh/ankitageorge/16/head -> origin/gh/ankitageorge/16/head 2025-08-14T21:20:14.7171530Z * [new branch] gh/ankitageorge/16/orig -> origin/gh/ankitageorge/16/orig 2025-08-14T21:20:14.7174005Z * [new branch] gh/ankitageorge/17/base -> origin/gh/ankitageorge/17/base 2025-08-14T21:20:14.7175922Z * [new branch] gh/ankitageorge/17/head -> origin/gh/ankitageorge/17/head 2025-08-14T21:20:14.7177643Z * [new branch] gh/ankitageorge/17/orig -> origin/gh/ankitageorge/17/orig 2025-08-14T21:20:14.7180590Z * [new branch] gh/ankitageorge/18/base -> origin/gh/ankitageorge/18/base 2025-08-14T21:20:14.7182409Z * [new branch] gh/ankitageorge/18/head -> origin/gh/ankitageorge/18/head 2025-08-14T21:20:14.7184092Z * [new branch] gh/ankitageorge/18/orig -> origin/gh/ankitageorge/18/orig 2025-08-14T21:20:14.7186823Z * [new branch] gh/ankitageorge/19/base -> origin/gh/ankitageorge/19/base 2025-08-14T21:20:14.7188685Z * [new branch] gh/ankitageorge/19/head -> origin/gh/ankitageorge/19/head 2025-08-14T21:20:14.7190433Z * [new branch] gh/ankitageorge/19/orig -> origin/gh/ankitageorge/19/orig 2025-08-14T21:20:14.7192909Z * [new branch] gh/ankitageorge/20/base -> origin/gh/ankitageorge/20/base 2025-08-14T21:20:14.7194876Z * [new branch] gh/ankitageorge/20/head -> origin/gh/ankitageorge/20/head 2025-08-14T21:20:14.7196691Z * [new branch] gh/ankitageorge/20/orig -> origin/gh/ankitageorge/20/orig 2025-08-14T21:20:14.7199252Z * [new branch] gh/ankitageorge/21/base -> origin/gh/ankitageorge/21/base 2025-08-14T21:20:14.7200990Z * [new branch] gh/ankitageorge/21/head -> origin/gh/ankitageorge/21/head 2025-08-14T21:20:14.7202802Z * [new branch] gh/ankitageorge/21/orig -> origin/gh/ankitageorge/21/orig 2025-08-14T21:20:14.7206373Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-08-14T21:20:14.7208117Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-08-14T21:20:14.7210964Z * [new branch] gh/anshul-si/10/base -> origin/gh/anshul-si/10/base 2025-08-14T21:20:14.7212735Z * [new branch] gh/anshul-si/10/head -> origin/gh/anshul-si/10/head 2025-08-14T21:20:14.7214554Z * [new branch] gh/anshul-si/10/orig -> origin/gh/anshul-si/10/orig 2025-08-14T21:20:14.7217161Z * [new branch] gh/anshul-si/11/base -> origin/gh/anshul-si/11/base 2025-08-14T21:20:14.7219082Z * [new branch] gh/anshul-si/11/head -> origin/gh/anshul-si/11/head 2025-08-14T21:20:14.7220832Z * [new branch] gh/anshul-si/11/orig -> origin/gh/anshul-si/11/orig 2025-08-14T21:20:14.7223234Z * [new branch] gh/anshul-si/12/base -> origin/gh/anshul-si/12/base 2025-08-14T21:20:14.7225012Z * [new branch] gh/anshul-si/12/head -> origin/gh/anshul-si/12/head 2025-08-14T21:20:14.7226847Z * [new branch] gh/anshul-si/12/orig -> origin/gh/anshul-si/12/orig 2025-08-14T21:20:14.7229487Z * [new branch] gh/anshul-si/13/base -> origin/gh/anshul-si/13/base 2025-08-14T21:20:14.7231487Z * [new branch] gh/anshul-si/13/head -> origin/gh/anshul-si/13/head 2025-08-14T21:20:14.7233292Z * [new branch] gh/anshul-si/13/orig -> origin/gh/anshul-si/13/orig 2025-08-14T21:20:14.7236223Z * [new branch] gh/anshul-si/14/base -> origin/gh/anshul-si/14/base 2025-08-14T21:20:14.7237933Z * [new branch] gh/anshul-si/14/head -> origin/gh/anshul-si/14/head 2025-08-14T21:20:14.7239728Z * [new branch] gh/anshul-si/14/orig -> origin/gh/anshul-si/14/orig 2025-08-14T21:20:14.7242066Z * [new branch] gh/anshul-si/15/base -> origin/gh/anshul-si/15/base 2025-08-14T21:20:14.7243937Z * [new branch] gh/anshul-si/15/head -> origin/gh/anshul-si/15/head 2025-08-14T21:20:14.7245819Z * [new branch] gh/anshul-si/15/orig -> origin/gh/anshul-si/15/orig 2025-08-14T21:20:14.7248755Z * [new branch] gh/anshul-si/16/base -> origin/gh/anshul-si/16/base 2025-08-14T21:20:14.7250559Z * [new branch] gh/anshul-si/16/head -> origin/gh/anshul-si/16/head 2025-08-14T21:20:14.7252637Z * [new branch] gh/anshul-si/16/orig -> origin/gh/anshul-si/16/orig 2025-08-14T21:20:14.7255126Z * [new branch] gh/anshul-si/17/base -> origin/gh/anshul-si/17/base 2025-08-14T21:20:14.7256859Z * [new branch] gh/anshul-si/17/head -> origin/gh/anshul-si/17/head 2025-08-14T21:20:14.7258931Z * [new branch] gh/anshul-si/17/orig -> origin/gh/anshul-si/17/orig 2025-08-14T21:20:14.7261542Z * [new branch] gh/anshul-si/18/base -> origin/gh/anshul-si/18/base 2025-08-14T21:20:14.7263259Z * [new branch] gh/anshul-si/18/head -> origin/gh/anshul-si/18/head 2025-08-14T21:20:14.7265054Z * [new branch] gh/anshul-si/18/orig -> origin/gh/anshul-si/18/orig 2025-08-14T21:20:14.7267652Z * [new branch] gh/anshul-si/19/base -> origin/gh/anshul-si/19/base 2025-08-14T21:20:14.7269776Z * [new branch] gh/anshul-si/19/head -> origin/gh/anshul-si/19/head 2025-08-14T21:20:14.7272007Z * [new branch] gh/anshul-si/19/orig -> origin/gh/anshul-si/19/orig 2025-08-14T21:20:14.7274317Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-08-14T21:20:14.7276127Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-08-14T21:20:14.7278601Z * [new branch] gh/anshul-si/20/base -> origin/gh/anshul-si/20/base 2025-08-14T21:20:14.7280365Z * [new branch] gh/anshul-si/20/head -> origin/gh/anshul-si/20/head 2025-08-14T21:20:14.7282200Z * [new branch] gh/anshul-si/20/orig -> origin/gh/anshul-si/20/orig 2025-08-14T21:20:14.7284745Z * [new branch] gh/anshul-si/21/base -> origin/gh/anshul-si/21/base 2025-08-14T21:20:14.7286639Z * [new branch] gh/anshul-si/21/head -> origin/gh/anshul-si/21/head 2025-08-14T21:20:14.7288468Z * [new branch] gh/anshul-si/21/orig -> origin/gh/anshul-si/21/orig 2025-08-14T21:20:14.7290985Z * [new branch] gh/anshul-si/22/base -> origin/gh/anshul-si/22/base 2025-08-14T21:20:14.7293058Z * [new branch] gh/anshul-si/22/head -> origin/gh/anshul-si/22/head 2025-08-14T21:20:14.7294312Z * [new branch] gh/anshul-si/22/orig -> origin/gh/anshul-si/22/orig 2025-08-14T21:20:14.7297085Z * [new branch] gh/anshul-si/23/base -> origin/gh/anshul-si/23/base 2025-08-14T21:20:14.7298879Z * [new branch] gh/anshul-si/23/head -> origin/gh/anshul-si/23/head 2025-08-14T21:20:14.7300643Z * [new branch] gh/anshul-si/23/orig -> origin/gh/anshul-si/23/orig 2025-08-14T21:20:14.7303239Z * [new branch] gh/anshul-si/24/base -> origin/gh/anshul-si/24/base 2025-08-14T21:20:14.7305167Z * [new branch] gh/anshul-si/24/head -> origin/gh/anshul-si/24/head 2025-08-14T21:20:14.7307160Z * [new branch] gh/anshul-si/24/orig -> origin/gh/anshul-si/24/orig 2025-08-14T21:20:14.7309628Z * [new branch] gh/anshul-si/25/base -> origin/gh/anshul-si/25/base 2025-08-14T21:20:14.7311486Z * [new branch] gh/anshul-si/25/head -> origin/gh/anshul-si/25/head 2025-08-14T21:20:14.7313130Z * [new branch] gh/anshul-si/25/orig -> origin/gh/anshul-si/25/orig 2025-08-14T21:20:14.7316241Z * [new branch] gh/anshul-si/26/base -> origin/gh/anshul-si/26/base 2025-08-14T21:20:14.7317818Z * [new branch] gh/anshul-si/26/head -> origin/gh/anshul-si/26/head 2025-08-14T21:20:14.7319617Z * [new branch] gh/anshul-si/26/orig -> origin/gh/anshul-si/26/orig 2025-08-14T21:20:14.7322060Z * [new branch] gh/anshul-si/27/base -> origin/gh/anshul-si/27/base 2025-08-14T21:20:14.7323868Z * [new branch] gh/anshul-si/27/head -> origin/gh/anshul-si/27/head 2025-08-14T21:20:14.7325791Z * [new branch] gh/anshul-si/27/orig -> origin/gh/anshul-si/27/orig 2025-08-14T21:20:14.7328326Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-08-14T21:20:14.7329991Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-08-14T21:20:14.7332457Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-08-14T21:20:14.7334218Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-08-14T21:20:14.7336646Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-08-14T21:20:14.7338347Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-08-14T21:20:14.7341479Z * [new branch] gh/anshul-si/6/base -> origin/gh/anshul-si/6/base 2025-08-14T21:20:14.7343101Z * [new branch] gh/anshul-si/6/head -> origin/gh/anshul-si/6/head 2025-08-14T21:20:14.7344821Z * [new branch] gh/anshul-si/6/orig -> origin/gh/anshul-si/6/orig 2025-08-14T21:20:14.7347576Z * [new branch] gh/anshul-si/7/base -> origin/gh/anshul-si/7/base 2025-08-14T21:20:14.7351858Z * [new branch] gh/anshul-si/7/head -> origin/gh/anshul-si/7/head 2025-08-14T21:20:14.7353669Z * [new branch] gh/anshul-si/7/orig -> origin/gh/anshul-si/7/orig 2025-08-14T21:20:14.7356287Z * [new branch] gh/anshul-si/8/base -> origin/gh/anshul-si/8/base 2025-08-14T21:20:14.7358204Z * [new branch] gh/anshul-si/8/head -> origin/gh/anshul-si/8/head 2025-08-14T21:20:14.7359960Z * [new branch] gh/anshul-si/8/orig -> origin/gh/anshul-si/8/orig 2025-08-14T21:20:14.7362574Z * [new branch] gh/anshul-si/9/base -> origin/gh/anshul-si/9/base 2025-08-14T21:20:14.7364591Z * [new branch] gh/anshul-si/9/head -> origin/gh/anshul-si/9/head 2025-08-14T21:20:14.7366537Z * [new branch] gh/anshul-si/9/orig -> origin/gh/anshul-si/9/orig 2025-08-14T21:20:14.7369936Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-08-14T21:20:14.7371633Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-08-14T21:20:14.7374114Z * [new branch] gh/aorenste/235/base -> origin/gh/aorenste/235/base 2025-08-14T21:20:14.7375991Z * [new branch] gh/aorenste/235/head -> origin/gh/aorenste/235/head 2025-08-14T21:20:14.7378056Z * [new branch] gh/aorenste/235/orig -> origin/gh/aorenste/235/orig 2025-08-14T21:20:14.7380852Z * [new branch] gh/aorenste/236/base -> origin/gh/aorenste/236/base 2025-08-14T21:20:14.7382699Z * [new branch] gh/aorenste/236/head -> origin/gh/aorenste/236/head 2025-08-14T21:20:14.7384559Z * [new branch] gh/aorenste/236/orig -> origin/gh/aorenste/236/orig 2025-08-14T21:20:14.7387321Z * [new branch] gh/aorenste/237/base -> origin/gh/aorenste/237/base 2025-08-14T21:20:14.7388928Z * [new branch] gh/aorenste/237/head -> origin/gh/aorenste/237/head 2025-08-14T21:20:14.7390730Z * [new branch] gh/aorenste/237/orig -> origin/gh/aorenste/237/orig 2025-08-14T21:20:14.7393454Z * [new branch] gh/aorenste/238/base -> origin/gh/aorenste/238/base 2025-08-14T21:20:14.7395324Z * [new branch] gh/aorenste/238/head -> origin/gh/aorenste/238/head 2025-08-14T21:20:14.7396984Z * [new branch] gh/aorenste/238/orig -> origin/gh/aorenste/238/orig 2025-08-14T21:20:14.7400113Z * [new branch] gh/bdhirsh/650/base -> origin/gh/bdhirsh/650/base 2025-08-14T21:20:14.7402102Z * [new branch] gh/bdhirsh/650/head -> origin/gh/bdhirsh/650/head 2025-08-14T21:20:14.7403904Z * [new branch] gh/bdhirsh/650/orig -> origin/gh/bdhirsh/650/orig 2025-08-14T21:20:14.7406927Z * [new branch] gh/bdhirsh/656/base -> origin/gh/bdhirsh/656/base 2025-08-14T21:20:14.7408550Z * [new branch] gh/bdhirsh/656/head -> origin/gh/bdhirsh/656/head 2025-08-14T21:20:14.7410977Z * [new branch] gh/bdhirsh/657/base -> origin/gh/bdhirsh/657/base 2025-08-14T21:20:14.7412669Z * [new branch] gh/bdhirsh/657/head -> origin/gh/bdhirsh/657/head 2025-08-14T21:20:14.7415042Z * [new branch] gh/bdhirsh/659/base -> origin/gh/bdhirsh/659/base 2025-08-14T21:20:14.7416906Z * [new branch] gh/bdhirsh/659/head -> origin/gh/bdhirsh/659/head 2025-08-14T21:20:14.7418655Z * [new branch] gh/bdhirsh/659/orig -> origin/gh/bdhirsh/659/orig 2025-08-14T21:20:14.7421110Z * [new branch] gh/bdhirsh/663/base -> origin/gh/bdhirsh/663/base 2025-08-14T21:20:14.7422912Z * [new branch] gh/bdhirsh/663/head -> origin/gh/bdhirsh/663/head 2025-08-14T21:20:14.7424666Z * [new branch] gh/bdhirsh/663/orig -> origin/gh/bdhirsh/663/orig 2025-08-14T21:20:14.7427232Z * [new branch] gh/bdhirsh/665/base -> origin/gh/bdhirsh/665/base 2025-08-14T21:20:14.7429015Z * [new branch] gh/bdhirsh/665/head -> origin/gh/bdhirsh/665/head 2025-08-14T21:20:14.7430869Z * [new branch] gh/bdhirsh/665/orig -> origin/gh/bdhirsh/665/orig 2025-08-14T21:20:14.7433456Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-08-14T21:20:14.7435401Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-08-14T21:20:14.7436834Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-08-14T21:20:14.7440049Z * [new branch] gh/benjaminglass1/79/base -> origin/gh/benjaminglass1/79/base 2025-08-14T21:20:14.7441771Z * [new branch] gh/benjaminglass1/79/head -> origin/gh/benjaminglass1/79/head 2025-08-14T21:20:14.7443535Z * [new branch] gh/benjaminglass1/79/orig -> origin/gh/benjaminglass1/79/orig 2025-08-14T21:20:14.7446283Z * [new branch] gh/benjaminglass1/86/base -> origin/gh/benjaminglass1/86/base 2025-08-14T21:20:14.7448285Z * [new branch] gh/benjaminglass1/86/head -> origin/gh/benjaminglass1/86/head 2025-08-14T21:20:14.7450134Z * [new branch] gh/benjaminglass1/86/orig -> origin/gh/benjaminglass1/86/orig 2025-08-14T21:20:14.7452570Z * [new branch] gh/benjaminglass1/89/base -> origin/gh/benjaminglass1/89/base 2025-08-14T21:20:14.7454429Z * [new branch] gh/benjaminglass1/89/head -> origin/gh/benjaminglass1/89/head 2025-08-14T21:20:14.7456242Z * [new branch] gh/benjaminglass1/89/orig -> origin/gh/benjaminglass1/89/orig 2025-08-14T21:20:14.7458714Z * [new branch] gh/benjaminglass1/91/base -> origin/gh/benjaminglass1/91/base 2025-08-14T21:20:14.7460458Z * [new branch] gh/benjaminglass1/91/head -> origin/gh/benjaminglass1/91/head 2025-08-14T21:20:14.7462297Z * [new branch] gh/benjaminglass1/91/orig -> origin/gh/benjaminglass1/91/orig 2025-08-14T21:20:14.7464813Z * [new branch] gh/benjaminglass1/93/base -> origin/gh/benjaminglass1/93/base 2025-08-14T21:20:14.7466564Z * [new branch] gh/benjaminglass1/93/head -> origin/gh/benjaminglass1/93/head 2025-08-14T21:20:14.7468425Z * [new branch] gh/benjaminglass1/93/orig -> origin/gh/benjaminglass1/93/orig 2025-08-14T21:20:14.7470909Z * [new branch] gh/benjaminglass1/94/base -> origin/gh/benjaminglass1/94/base 2025-08-14T21:20:14.7472662Z * [new branch] gh/benjaminglass1/94/head -> origin/gh/benjaminglass1/94/head 2025-08-14T21:20:14.7474541Z * [new branch] gh/benjaminglass1/94/orig -> origin/gh/benjaminglass1/94/orig 2025-08-14T21:20:14.7477268Z * [new branch] gh/benjaminglass1/95/base -> origin/gh/benjaminglass1/95/base 2025-08-14T21:20:14.7479036Z * [new branch] gh/benjaminglass1/95/head -> origin/gh/benjaminglass1/95/head 2025-08-14T21:20:14.7480833Z * [new branch] gh/benjaminglass1/95/orig -> origin/gh/benjaminglass1/95/orig 2025-08-14T21:20:14.7483321Z * [new branch] gh/benjaminglass1/96/base -> origin/gh/benjaminglass1/96/base 2025-08-14T21:20:14.7485366Z * [new branch] gh/benjaminglass1/96/head -> origin/gh/benjaminglass1/96/head 2025-08-14T21:20:14.7487050Z * [new branch] gh/benjaminglass1/96/orig -> origin/gh/benjaminglass1/96/orig 2025-08-14T21:20:14.7489600Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-08-14T21:20:14.7491497Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-08-14T21:20:14.7493191Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-08-14T21:20:14.7495888Z * [new branch] gh/benjaminglass1/98/base -> origin/gh/benjaminglass1/98/base 2025-08-14T21:20:14.7497537Z * [new branch] gh/benjaminglass1/98/head -> origin/gh/benjaminglass1/98/head 2025-08-14T21:20:14.7499618Z * [new branch] gh/benjaminglass1/98/orig -> origin/gh/benjaminglass1/98/orig 2025-08-14T21:20:14.7502420Z * [new branch] gh/bobrenjc93/478/base -> origin/gh/bobrenjc93/478/base 2025-08-14T21:20:14.7504216Z * [new branch] gh/bobrenjc93/478/head -> origin/gh/bobrenjc93/478/head 2025-08-14T21:20:14.7505987Z * [new branch] gh/bobrenjc93/478/orig -> origin/gh/bobrenjc93/478/orig 2025-08-14T21:20:14.7508417Z * [new branch] gh/bobrenjc93/514/base -> origin/gh/bobrenjc93/514/base 2025-08-14T21:20:14.7510321Z * [new branch] gh/bobrenjc93/514/head -> origin/gh/bobrenjc93/514/head 2025-08-14T21:20:14.7511993Z * [new branch] gh/bobrenjc93/514/orig -> origin/gh/bobrenjc93/514/orig 2025-08-14T21:20:14.7514401Z * [new branch] gh/bobrenjc93/521/base -> origin/gh/bobrenjc93/521/base 2025-08-14T21:20:14.7516162Z * [new branch] gh/bobrenjc93/521/head -> origin/gh/bobrenjc93/521/head 2025-08-14T21:20:14.7517920Z * [new branch] gh/bobrenjc93/521/orig -> origin/gh/bobrenjc93/521/orig 2025-08-14T21:20:14.7520375Z * [new branch] gh/bobrenjc93/522/base -> origin/gh/bobrenjc93/522/base 2025-08-14T21:20:14.7522171Z * [new branch] gh/bobrenjc93/522/head -> origin/gh/bobrenjc93/522/head 2025-08-14T21:20:14.7523946Z * [new branch] gh/bobrenjc93/522/orig -> origin/gh/bobrenjc93/522/orig 2025-08-14T21:20:14.7526633Z * [new branch] gh/bobrenjc93/525/base -> origin/gh/bobrenjc93/525/base 2025-08-14T21:20:14.7528405Z * [new branch] gh/bobrenjc93/525/head -> origin/gh/bobrenjc93/525/head 2025-08-14T21:20:14.7530329Z * [new branch] gh/bobrenjc93/525/orig -> origin/gh/bobrenjc93/525/orig 2025-08-14T21:20:14.7532667Z * [new branch] gh/bobrenjc93/526/base -> origin/gh/bobrenjc93/526/base 2025-08-14T21:20:14.7534398Z * [new branch] gh/bobrenjc93/526/head -> origin/gh/bobrenjc93/526/head 2025-08-14T21:20:14.7536155Z * [new branch] gh/bobrenjc93/526/orig -> origin/gh/bobrenjc93/526/orig 2025-08-14T21:20:14.7538645Z * [new branch] gh/bobrenjc93/527/base -> origin/gh/bobrenjc93/527/base 2025-08-14T21:20:14.7540611Z * [new branch] gh/bobrenjc93/527/head -> origin/gh/bobrenjc93/527/head 2025-08-14T21:20:14.7542287Z * [new branch] gh/bobrenjc93/527/orig -> origin/gh/bobrenjc93/527/orig 2025-08-14T21:20:14.7544762Z * [new branch] gh/bobrenjc93/528/base -> origin/gh/bobrenjc93/528/base 2025-08-14T21:20:14.7546613Z * [new branch] gh/bobrenjc93/528/head -> origin/gh/bobrenjc93/528/head 2025-08-14T21:20:14.7548508Z * [new branch] gh/bobrenjc93/528/orig -> origin/gh/bobrenjc93/528/orig 2025-08-14T21:20:14.7551038Z * [new branch] gh/bobrenjc93/529/base -> origin/gh/bobrenjc93/529/base 2025-08-14T21:20:14.7552825Z * [new branch] gh/bobrenjc93/529/head -> origin/gh/bobrenjc93/529/head 2025-08-14T21:20:14.7554588Z * [new branch] gh/bobrenjc93/529/orig -> origin/gh/bobrenjc93/529/orig 2025-08-14T21:20:14.7557218Z * [new branch] gh/bobrenjc93/534/base -> origin/gh/bobrenjc93/534/base 2025-08-14T21:20:14.7558667Z * [new branch] gh/bobrenjc93/534/head -> origin/gh/bobrenjc93/534/head 2025-08-14T21:20:14.7560649Z * [new branch] gh/bobrenjc93/534/orig -> origin/gh/bobrenjc93/534/orig 2025-08-14T21:20:14.7563163Z * [new branch] gh/bobrenjc93/535/base -> origin/gh/bobrenjc93/535/base 2025-08-14T21:20:14.7564830Z * [new branch] gh/bobrenjc93/535/head -> origin/gh/bobrenjc93/535/head 2025-08-14T21:20:14.7566689Z * [new branch] gh/bobrenjc93/535/orig -> origin/gh/bobrenjc93/535/orig 2025-08-14T21:20:14.7569237Z * [new branch] gh/bobrenjc93/536/base -> origin/gh/bobrenjc93/536/base 2025-08-14T21:20:14.7571031Z * [new branch] gh/bobrenjc93/536/head -> origin/gh/bobrenjc93/536/head 2025-08-14T21:20:14.7572810Z * [new branch] gh/bobrenjc93/536/orig -> origin/gh/bobrenjc93/536/orig 2025-08-14T21:20:14.7575514Z * [new branch] gh/bobrenjc93/537/base -> origin/gh/bobrenjc93/537/base 2025-08-14T21:20:14.7577223Z * [new branch] gh/bobrenjc93/537/head -> origin/gh/bobrenjc93/537/head 2025-08-14T21:20:14.7579017Z * [new branch] gh/bobrenjc93/537/orig -> origin/gh/bobrenjc93/537/orig 2025-08-14T21:20:14.7581485Z * [new branch] gh/bobrenjc93/538/base -> origin/gh/bobrenjc93/538/base 2025-08-14T21:20:14.7583113Z * [new branch] gh/bobrenjc93/538/head -> origin/gh/bobrenjc93/538/head 2025-08-14T21:20:14.7584984Z * [new branch] gh/bobrenjc93/538/orig -> origin/gh/bobrenjc93/538/orig 2025-08-14T21:20:14.7587613Z * [new branch] gh/bobrenjc93/539/base -> origin/gh/bobrenjc93/539/base 2025-08-14T21:20:14.7589451Z * [new branch] gh/bobrenjc93/539/head -> origin/gh/bobrenjc93/539/head 2025-08-14T21:20:14.7592069Z * [new branch] gh/bobrenjc93/539/orig -> origin/gh/bobrenjc93/539/orig 2025-08-14T21:20:14.7594751Z * [new branch] gh/bobrenjc93/540/base -> origin/gh/bobrenjc93/540/base 2025-08-14T21:20:14.7596546Z * [new branch] gh/bobrenjc93/540/head -> origin/gh/bobrenjc93/540/head 2025-08-14T21:20:14.7598460Z * [new branch] gh/bobrenjc93/540/orig -> origin/gh/bobrenjc93/540/orig 2025-08-14T21:20:14.7601527Z * [new branch] gh/bobrenjc93/541/base -> origin/gh/bobrenjc93/541/base 2025-08-14T21:20:14.7603237Z * [new branch] gh/bobrenjc93/541/head -> origin/gh/bobrenjc93/541/head 2025-08-14T21:20:14.7605288Z * [new branch] gh/bobrenjc93/541/orig -> origin/gh/bobrenjc93/541/orig 2025-08-14T21:20:14.7607773Z * [new branch] gh/bobrenjc93/542/base -> origin/gh/bobrenjc93/542/base 2025-08-14T21:20:14.7609473Z * [new branch] gh/bobrenjc93/542/head -> origin/gh/bobrenjc93/542/head 2025-08-14T21:20:14.7611225Z * [new branch] gh/bobrenjc93/542/orig -> origin/gh/bobrenjc93/542/orig 2025-08-14T21:20:14.7613881Z * [new branch] gh/bobrenjc93/543/base -> origin/gh/bobrenjc93/543/base 2025-08-14T21:20:14.7615646Z * [new branch] gh/bobrenjc93/543/head -> origin/gh/bobrenjc93/543/head 2025-08-14T21:20:14.7617331Z * [new branch] gh/bobrenjc93/543/orig -> origin/gh/bobrenjc93/543/orig 2025-08-14T21:20:14.7619873Z * [new branch] gh/bobrenjc93/544/base -> origin/gh/bobrenjc93/544/base 2025-08-14T21:20:14.7621542Z * [new branch] gh/bobrenjc93/544/head -> origin/gh/bobrenjc93/544/head 2025-08-14T21:20:14.7623302Z * [new branch] gh/bobrenjc93/544/orig -> origin/gh/bobrenjc93/544/orig 2025-08-14T21:20:14.7625828Z * [new branch] gh/bobrenjc93/545/base -> origin/gh/bobrenjc93/545/base 2025-08-14T21:20:14.7627794Z * [new branch] gh/bobrenjc93/545/head -> origin/gh/bobrenjc93/545/head 2025-08-14T21:20:14.7629655Z * [new branch] gh/bobrenjc93/545/orig -> origin/gh/bobrenjc93/545/orig 2025-08-14T21:20:14.7632114Z * [new branch] gh/bobrenjc93/546/base -> origin/gh/bobrenjc93/546/base 2025-08-14T21:20:14.7633969Z * [new branch] gh/bobrenjc93/546/head -> origin/gh/bobrenjc93/546/head 2025-08-14T21:20:14.7635925Z * [new branch] gh/bobrenjc93/546/orig -> origin/gh/bobrenjc93/546/orig 2025-08-14T21:20:14.7638779Z * [new branch] gh/bobrenjc93/547/base -> origin/gh/bobrenjc93/547/base 2025-08-14T21:20:14.7640631Z * [new branch] gh/bobrenjc93/547/head -> origin/gh/bobrenjc93/547/head 2025-08-14T21:20:14.7642487Z * [new branch] gh/bobrenjc93/547/orig -> origin/gh/bobrenjc93/547/orig 2025-08-14T21:20:14.7645088Z * [new branch] gh/bobrenjc93/548/base -> origin/gh/bobrenjc93/548/base 2025-08-14T21:20:14.7646920Z * [new branch] gh/bobrenjc93/548/head -> origin/gh/bobrenjc93/548/head 2025-08-14T21:20:14.7648801Z * [new branch] gh/bobrenjc93/548/orig -> origin/gh/bobrenjc93/548/orig 2025-08-14T21:20:14.7651216Z * [new branch] gh/bobrenjc93/549/base -> origin/gh/bobrenjc93/549/base 2025-08-14T21:20:14.7653258Z * [new branch] gh/bobrenjc93/549/head -> origin/gh/bobrenjc93/549/head 2025-08-14T21:20:14.7655021Z * [new branch] gh/bobrenjc93/549/orig -> origin/gh/bobrenjc93/549/orig 2025-08-14T21:20:14.7658041Z * [new branch] gh/briancoutinho/2/base -> origin/gh/briancoutinho/2/base 2025-08-14T21:20:14.7660054Z * [new branch] gh/briancoutinho/2/head -> origin/gh/briancoutinho/2/head 2025-08-14T21:20:14.7663331Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-08-14T21:20:14.7665036Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-08-14T21:20:14.7667979Z * [new branch] gh/c00w/38/base -> origin/gh/c00w/38/base 2025-08-14T21:20:14.7669179Z * [new branch] gh/c00w/38/head -> origin/gh/c00w/38/head 2025-08-14T21:20:14.7671389Z * [new branch] gh/c00w/38/orig -> origin/gh/c00w/38/orig 2025-08-14T21:20:14.7674470Z * [new branch] gh/c00w/48/base -> origin/gh/c00w/48/base 2025-08-14T21:20:14.7676430Z * [new branch] gh/c00w/48/head -> origin/gh/c00w/48/head 2025-08-14T21:20:14.7678589Z * [new branch] gh/c00w/48/orig -> origin/gh/c00w/48/orig 2025-08-14T21:20:14.7681299Z * [new branch] gh/c00w/50/base -> origin/gh/c00w/50/base 2025-08-14T21:20:14.7683198Z * [new branch] gh/c00w/50/head -> origin/gh/c00w/50/head 2025-08-14T21:20:14.7685306Z * [new branch] gh/c00w/50/orig -> origin/gh/c00w/50/orig 2025-08-14T21:20:14.7688143Z * [new branch] gh/c00w/51/base -> origin/gh/c00w/51/base 2025-08-14T21:20:14.7690059Z * [new branch] gh/c00w/51/head -> origin/gh/c00w/51/head 2025-08-14T21:20:14.7691989Z * [new branch] gh/c00w/51/orig -> origin/gh/c00w/51/orig 2025-08-14T21:20:14.7694598Z * [new branch] gh/c00w/52/base -> origin/gh/c00w/52/base 2025-08-14T21:20:14.7696503Z * [new branch] gh/c00w/52/head -> origin/gh/c00w/52/head 2025-08-14T21:20:14.7698334Z * [new branch] gh/c00w/52/orig -> origin/gh/c00w/52/orig 2025-08-14T21:20:14.7701022Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-08-14T21:20:14.7702639Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-08-14T21:20:14.7704349Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-08-14T21:20:14.7706666Z * [new branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-08-14T21:20:14.7708482Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-08-14T21:20:14.7710272Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-08-14T21:20:14.7713219Z * [new branch] gh/chenmillie/1/base -> origin/gh/chenmillie/1/base 2025-08-14T21:20:14.7715602Z * [new branch] gh/chenmillie/1/head -> origin/gh/chenmillie/1/head 2025-08-14T21:20:14.7717321Z * [new branch] gh/chenmillie/1/orig -> origin/gh/chenmillie/1/orig 2025-08-14T21:20:14.7720329Z * [new branch] gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-08-14T21:20:14.7722244Z * [new branch] gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-08-14T21:20:14.7724119Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-08-14T21:20:14.7727364Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-08-14T21:20:14.7729249Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-08-14T21:20:14.7732044Z * [new branch] gh/coconutruben/11/base -> origin/gh/coconutruben/11/base 2025-08-14T21:20:14.7733790Z * [new branch] gh/coconutruben/11/head -> origin/gh/coconutruben/11/head 2025-08-14T21:20:14.7735563Z * [new branch] gh/coconutruben/11/orig -> origin/gh/coconutruben/11/orig 2025-08-14T21:20:14.7738549Z * [new branch] gh/coconutruben/12/base -> origin/gh/coconutruben/12/base 2025-08-14T21:20:14.7740718Z * [new branch] gh/coconutruben/12/head -> origin/gh/coconutruben/12/head 2025-08-14T21:20:14.7742764Z * [new branch] gh/coconutruben/12/orig -> origin/gh/coconutruben/12/orig 2025-08-14T21:20:14.7745397Z * [new branch] gh/coconutruben/13/base -> origin/gh/coconutruben/13/base 2025-08-14T21:20:14.7747411Z * [new branch] gh/coconutruben/13/head -> origin/gh/coconutruben/13/head 2025-08-14T21:20:14.7752201Z * [new branch] gh/coconutruben/13/orig -> origin/gh/coconutruben/13/orig 2025-08-14T21:20:14.7754738Z * [new branch] gh/coconutruben/14/base -> origin/gh/coconutruben/14/base 2025-08-14T21:20:14.7756802Z * [new branch] gh/coconutruben/14/head -> origin/gh/coconutruben/14/head 2025-08-14T21:20:14.7758456Z * [new branch] gh/coconutruben/14/orig -> origin/gh/coconutruben/14/orig 2025-08-14T21:20:14.7761656Z * [new branch] gh/coconutruben/15/base -> origin/gh/coconutruben/15/base 2025-08-14T21:20:14.7763665Z * [new branch] gh/coconutruben/15/head -> origin/gh/coconutruben/15/head 2025-08-14T21:20:14.7765718Z * [new branch] gh/coconutruben/15/orig -> origin/gh/coconutruben/15/orig 2025-08-14T21:20:14.7768228Z * [new branch] gh/coconutruben/16/base -> origin/gh/coconutruben/16/base 2025-08-14T21:20:14.7769951Z * [new branch] gh/coconutruben/16/head -> origin/gh/coconutruben/16/head 2025-08-14T21:20:14.7771778Z * [new branch] gh/coconutruben/16/orig -> origin/gh/coconutruben/16/orig 2025-08-14T21:20:14.7774553Z * [new branch] gh/coconutruben/17/base -> origin/gh/coconutruben/17/base 2025-08-14T21:20:14.7776621Z * [new branch] gh/coconutruben/17/head -> origin/gh/coconutruben/17/head 2025-08-14T21:20:14.7778567Z * [new branch] gh/coconutruben/17/orig -> origin/gh/coconutruben/17/orig 2025-08-14T21:20:14.7780978Z * [new branch] gh/coconutruben/18/base -> origin/gh/coconutruben/18/base 2025-08-14T21:20:14.7782845Z * [new branch] gh/coconutruben/18/head -> origin/gh/coconutruben/18/head 2025-08-14T21:20:14.7784626Z * [new branch] gh/coconutruben/18/orig -> origin/gh/coconutruben/18/orig 2025-08-14T21:20:14.7787360Z * [new branch] gh/coconutruben/19/base -> origin/gh/coconutruben/19/base 2025-08-14T21:20:14.7789230Z * [new branch] gh/coconutruben/19/head -> origin/gh/coconutruben/19/head 2025-08-14T21:20:14.7791116Z * [new branch] gh/coconutruben/19/orig -> origin/gh/coconutruben/19/orig 2025-08-14T21:20:14.7793702Z * [new branch] gh/coconutruben/20/base -> origin/gh/coconutruben/20/base 2025-08-14T21:20:14.7795604Z * [new branch] gh/coconutruben/20/head -> origin/gh/coconutruben/20/head 2025-08-14T21:20:14.7797469Z * [new branch] gh/coconutruben/20/orig -> origin/gh/coconutruben/20/orig 2025-08-14T21:20:14.7800057Z * [new branch] gh/coconutruben/21/base -> origin/gh/coconutruben/21/base 2025-08-14T21:20:14.7801836Z * [new branch] gh/coconutruben/21/head -> origin/gh/coconutruben/21/head 2025-08-14T21:20:14.7803784Z * [new branch] gh/coconutruben/21/orig -> origin/gh/coconutruben/21/orig 2025-08-14T21:20:14.7806455Z * [new branch] gh/coconutruben/22/base -> origin/gh/coconutruben/22/base 2025-08-14T21:20:14.7808267Z * [new branch] gh/coconutruben/22/head -> origin/gh/coconutruben/22/head 2025-08-14T21:20:14.7810281Z * [new branch] gh/coconutruben/22/orig -> origin/gh/coconutruben/22/orig 2025-08-14T21:20:14.7812868Z * [new branch] gh/coconutruben/23/base -> origin/gh/coconutruben/23/base 2025-08-14T21:20:14.7814714Z * [new branch] gh/coconutruben/23/head -> origin/gh/coconutruben/23/head 2025-08-14T21:20:14.7816525Z * [new branch] gh/coconutruben/23/orig -> origin/gh/coconutruben/23/orig 2025-08-14T21:20:14.7819090Z * [new branch] gh/coconutruben/24/base -> origin/gh/coconutruben/24/base 2025-08-14T21:20:14.7820925Z * [new branch] gh/coconutruben/24/head -> origin/gh/coconutruben/24/head 2025-08-14T21:20:14.7822733Z * [new branch] gh/coconutruben/24/orig -> origin/gh/coconutruben/24/orig 2025-08-14T21:20:14.7825970Z * [new branch] gh/coconutruben/25/base -> origin/gh/coconutruben/25/base 2025-08-14T21:20:14.7827937Z * [new branch] gh/coconutruben/25/head -> origin/gh/coconutruben/25/head 2025-08-14T21:20:14.7830239Z * [new branch] gh/coconutruben/25/orig -> origin/gh/coconutruben/25/orig 2025-08-14T21:20:14.7832858Z * [new branch] gh/coconutruben/26/base -> origin/gh/coconutruben/26/base 2025-08-14T21:20:14.7834819Z * [new branch] gh/coconutruben/26/head -> origin/gh/coconutruben/26/head 2025-08-14T21:20:14.7836484Z * [new branch] gh/coconutruben/26/orig -> origin/gh/coconutruben/26/orig 2025-08-14T21:20:14.7838807Z * [new branch] gh/coconutruben/27/base -> origin/gh/coconutruben/27/base 2025-08-14T21:20:14.7840721Z * [new branch] gh/coconutruben/27/head -> origin/gh/coconutruben/27/head 2025-08-14T21:20:14.7850222Z * [new branch] gh/coconutruben/27/orig -> origin/gh/coconutruben/27/orig 2025-08-14T21:20:14.7850876Z * [new branch] gh/codingwithsurya/10/base -> origin/gh/codingwithsurya/10/base 2025-08-14T21:20:14.7851463Z * [new branch] gh/codingwithsurya/10/head -> origin/gh/codingwithsurya/10/head 2025-08-14T21:20:14.7852218Z * [new branch] gh/codingwithsurya/10/orig -> origin/gh/codingwithsurya/10/orig 2025-08-14T21:20:14.7852843Z * [new branch] gh/codingwithsurya/11/base -> origin/gh/codingwithsurya/11/base 2025-08-14T21:20:14.7855185Z * [new branch] gh/codingwithsurya/11/head -> origin/gh/codingwithsurya/11/head 2025-08-14T21:20:14.7856825Z * [new branch] gh/codingwithsurya/11/orig -> origin/gh/codingwithsurya/11/orig 2025-08-14T21:20:14.7859764Z * [new branch] gh/codingwithsurya/12/base -> origin/gh/codingwithsurya/12/base 2025-08-14T21:20:14.7861810Z * [new branch] gh/codingwithsurya/12/head -> origin/gh/codingwithsurya/12/head 2025-08-14T21:20:14.7863746Z * [new branch] gh/codingwithsurya/12/orig -> origin/gh/codingwithsurya/12/orig 2025-08-14T21:20:14.7866311Z * [new branch] gh/codingwithsurya/13/base -> origin/gh/codingwithsurya/13/base 2025-08-14T21:20:14.7868203Z * [new branch] gh/codingwithsurya/13/head -> origin/gh/codingwithsurya/13/head 2025-08-14T21:20:14.7870086Z * [new branch] gh/codingwithsurya/13/orig -> origin/gh/codingwithsurya/13/orig 2025-08-14T21:20:14.7872459Z * [new branch] gh/codingwithsurya/14/base -> origin/gh/codingwithsurya/14/base 2025-08-14T21:20:14.7874244Z * [new branch] gh/codingwithsurya/14/head -> origin/gh/codingwithsurya/14/head 2025-08-14T21:20:14.7876044Z * [new branch] gh/codingwithsurya/14/orig -> origin/gh/codingwithsurya/14/orig 2025-08-14T21:20:14.7878822Z * [new branch] gh/codingwithsurya/15/base -> origin/gh/codingwithsurya/15/base 2025-08-14T21:20:14.7880691Z * [new branch] gh/codingwithsurya/15/head -> origin/gh/codingwithsurya/15/head 2025-08-14T21:20:14.7882588Z * [new branch] gh/codingwithsurya/15/orig -> origin/gh/codingwithsurya/15/orig 2025-08-14T21:20:14.7885497Z * [new branch] gh/codingwithsurya/16/base -> origin/gh/codingwithsurya/16/base 2025-08-14T21:20:14.7887325Z * [new branch] gh/codingwithsurya/16/head -> origin/gh/codingwithsurya/16/head 2025-08-14T21:20:14.7889106Z * [new branch] gh/codingwithsurya/16/orig -> origin/gh/codingwithsurya/16/orig 2025-08-14T21:20:14.7891882Z * [new branch] gh/codingwithsurya/17/base -> origin/gh/codingwithsurya/17/base 2025-08-14T21:20:14.7893759Z * [new branch] gh/codingwithsurya/17/head -> origin/gh/codingwithsurya/17/head 2025-08-14T21:20:14.7895558Z * [new branch] gh/codingwithsurya/17/orig -> origin/gh/codingwithsurya/17/orig 2025-08-14T21:20:14.7898113Z * [new branch] gh/codingwithsurya/18/base -> origin/gh/codingwithsurya/18/base 2025-08-14T21:20:14.7899983Z * [new branch] gh/codingwithsurya/18/head -> origin/gh/codingwithsurya/18/head 2025-08-14T21:20:14.7901742Z * [new branch] gh/codingwithsurya/18/orig -> origin/gh/codingwithsurya/18/orig 2025-08-14T21:20:14.7904371Z * [new branch] gh/codingwithsurya/19/base -> origin/gh/codingwithsurya/19/base 2025-08-14T21:20:14.7906285Z * [new branch] gh/codingwithsurya/19/head -> origin/gh/codingwithsurya/19/head 2025-08-14T21:20:14.7908105Z * [new branch] gh/codingwithsurya/19/orig -> origin/gh/codingwithsurya/19/orig 2025-08-14T21:20:14.7910678Z * [new branch] gh/codingwithsurya/20/base -> origin/gh/codingwithsurya/20/base 2025-08-14T21:20:14.7912451Z * [new branch] gh/codingwithsurya/20/head -> origin/gh/codingwithsurya/20/head 2025-08-14T21:20:14.7914194Z * [new branch] gh/codingwithsurya/20/orig -> origin/gh/codingwithsurya/20/orig 2025-08-14T21:20:14.7917004Z * [new branch] gh/codingwithsurya/21/base -> origin/gh/codingwithsurya/21/base 2025-08-14T21:20:14.7918840Z * [new branch] gh/codingwithsurya/21/head -> origin/gh/codingwithsurya/21/head 2025-08-14T21:20:14.7920769Z * [new branch] gh/codingwithsurya/21/orig -> origin/gh/codingwithsurya/21/orig 2025-08-14T21:20:14.7923244Z * [new branch] gh/codingwithsurya/8/base -> origin/gh/codingwithsurya/8/base 2025-08-14T21:20:14.7925244Z * [new branch] gh/codingwithsurya/8/head -> origin/gh/codingwithsurya/8/head 2025-08-14T21:20:14.7927094Z * [new branch] gh/codingwithsurya/8/orig -> origin/gh/codingwithsurya/8/orig 2025-08-14T21:20:14.7929688Z * [new branch] gh/codingwithsurya/9/base -> origin/gh/codingwithsurya/9/base 2025-08-14T21:20:14.7931613Z * [new branch] gh/codingwithsurya/9/head -> origin/gh/codingwithsurya/9/head 2025-08-14T21:20:14.7933378Z * [new branch] gh/codingwithsurya/9/orig -> origin/gh/codingwithsurya/9/orig 2025-08-14T21:20:14.7936484Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base 2025-08-14T21:20:14.7938229Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head 2025-08-14T21:20:14.7940559Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base 2025-08-14T21:20:14.7942340Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head 2025-08-14T21:20:14.7944665Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base 2025-08-14T21:20:14.7946415Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head 2025-08-14T21:20:14.7950982Z * [new branch] gh/colinchan15/4/base -> origin/gh/colinchan15/4/base 2025-08-14T21:20:14.7952246Z * [new branch] gh/colinchan15/4/head -> origin/gh/colinchan15/4/head 2025-08-14T21:20:14.7954817Z * [new branch] gh/colinchan15/5/base -> origin/gh/colinchan15/5/base 2025-08-14T21:20:14.7957176Z * [new branch] gh/colinchan15/5/head -> origin/gh/colinchan15/5/head 2025-08-14T21:20:14.7959550Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base 2025-08-14T21:20:14.7961345Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head 2025-08-14T21:20:14.7964432Z * [new branch] gh/davidberard98/351/base -> origin/gh/davidberard98/351/base 2025-08-14T21:20:14.7966423Z * [new branch] gh/davidberard98/351/head -> origin/gh/davidberard98/351/head 2025-08-14T21:20:14.7968200Z * [new branch] gh/davidberard98/351/orig -> origin/gh/davidberard98/351/orig 2025-08-14T21:20:14.7970497Z * [new branch] gh/davidberard98/353/base -> origin/gh/davidberard98/353/base 2025-08-14T21:20:14.7972267Z * [new branch] gh/davidberard98/353/head -> origin/gh/davidberard98/353/head 2025-08-14T21:20:14.7974037Z * [new branch] gh/davidberard98/353/orig -> origin/gh/davidberard98/353/orig 2025-08-14T21:20:14.7976558Z * [new branch] gh/davidberard98/356/base -> origin/gh/davidberard98/356/base 2025-08-14T21:20:14.7978461Z * [new branch] gh/davidberard98/356/head -> origin/gh/davidberard98/356/head 2025-08-14T21:20:14.7980211Z * [new branch] gh/davidberard98/356/orig -> origin/gh/davidberard98/356/orig 2025-08-14T21:20:14.7982787Z * [new branch] gh/davidberard98/382/base -> origin/gh/davidberard98/382/base 2025-08-14T21:20:14.7984755Z * [new branch] gh/davidberard98/382/head -> origin/gh/davidberard98/382/head 2025-08-14T21:20:14.7986590Z * [new branch] gh/davidberard98/382/orig -> origin/gh/davidberard98/382/orig 2025-08-14T21:20:14.7989076Z * [new branch] gh/davidberard98/386/base -> origin/gh/davidberard98/386/base 2025-08-14T21:20:14.7990945Z * [new branch] gh/davidberard98/386/head -> origin/gh/davidberard98/386/head 2025-08-14T21:20:14.7992734Z * [new branch] gh/davidberard98/386/orig -> origin/gh/davidberard98/386/orig 2025-08-14T21:20:14.7995373Z * [new branch] gh/davidberard98/389/base -> origin/gh/davidberard98/389/base 2025-08-14T21:20:14.7997039Z * [new branch] gh/davidberard98/389/head -> origin/gh/davidberard98/389/head 2025-08-14T21:20:14.7998800Z * [new branch] gh/davidberard98/389/orig -> origin/gh/davidberard98/389/orig 2025-08-14T21:20:14.8001417Z * [new branch] gh/davidberard98/390/base -> origin/gh/davidberard98/390/base 2025-08-14T21:20:14.8003242Z * [new branch] gh/davidberard98/390/head -> origin/gh/davidberard98/390/head 2025-08-14T21:20:14.8005188Z * [new branch] gh/davidberard98/390/orig -> origin/gh/davidberard98/390/orig 2025-08-14T21:20:14.8007722Z * [new branch] gh/davidberard98/391/base -> origin/gh/davidberard98/391/base 2025-08-14T21:20:14.8009576Z * [new branch] gh/davidberard98/391/head -> origin/gh/davidberard98/391/head 2025-08-14T21:20:14.8011335Z * [new branch] gh/davidberard98/391/orig -> origin/gh/davidberard98/391/orig 2025-08-14T21:20:14.8013897Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base 2025-08-14T21:20:14.8015701Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head 2025-08-14T21:20:14.8017496Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig 2025-08-14T21:20:14.8019981Z * [new branch] gh/davidberard98/393/base -> origin/gh/davidberard98/393/base 2025-08-14T21:20:14.8021766Z * [new branch] gh/davidberard98/393/head -> origin/gh/davidberard98/393/head 2025-08-14T21:20:14.8023525Z * [new branch] gh/davidberard98/393/orig -> origin/gh/davidberard98/393/orig 2025-08-14T21:20:14.8026152Z * [new branch] gh/davidberard98/394/base -> origin/gh/davidberard98/394/base 2025-08-14T21:20:14.8028085Z * [new branch] gh/davidberard98/394/head -> origin/gh/davidberard98/394/head 2025-08-14T21:20:14.8029966Z * [new branch] gh/davidberard98/394/orig -> origin/gh/davidberard98/394/orig 2025-08-14T21:20:14.8032588Z * [new branch] gh/davidberard98/395/base -> origin/gh/davidberard98/395/base 2025-08-14T21:20:14.8034591Z * [new branch] gh/davidberard98/395/head -> origin/gh/davidberard98/395/head 2025-08-14T21:20:14.8036482Z * [new branch] gh/davidberard98/395/orig -> origin/gh/davidberard98/395/orig 2025-08-14T21:20:14.8038885Z * [new branch] gh/davidberard98/396/base -> origin/gh/davidberard98/396/base 2025-08-14T21:20:14.8040705Z * [new branch] gh/davidberard98/396/head -> origin/gh/davidberard98/396/head 2025-08-14T21:20:14.8042469Z * [new branch] gh/davidberard98/396/orig -> origin/gh/davidberard98/396/orig 2025-08-14T21:20:14.8045321Z * [new branch] gh/davidberard98/397/base -> origin/gh/davidberard98/397/base 2025-08-14T21:20:14.8047327Z * [new branch] gh/davidberard98/397/head -> origin/gh/davidberard98/397/head 2025-08-14T21:20:14.8049550Z * [new branch] gh/davidberard98/397/orig -> origin/gh/davidberard98/397/orig 2025-08-14T21:20:14.8052102Z * [new branch] gh/davidberard98/398/base -> origin/gh/davidberard98/398/base 2025-08-14T21:20:14.8053921Z * [new branch] gh/davidberard98/398/head -> origin/gh/davidberard98/398/head 2025-08-14T21:20:14.8055610Z * [new branch] gh/davidberard98/398/orig -> origin/gh/davidberard98/398/orig 2025-08-14T21:20:14.8058668Z * [new branch] gh/desertfire/570/base -> origin/gh/desertfire/570/base 2025-08-14T21:20:14.8060552Z * [new branch] gh/desertfire/570/head -> origin/gh/desertfire/570/head 2025-08-14T21:20:14.8062400Z * [new branch] gh/desertfire/570/orig -> origin/gh/desertfire/570/orig 2025-08-14T21:20:14.8064694Z * [new branch] gh/desertfire/572/base -> origin/gh/desertfire/572/base 2025-08-14T21:20:14.8066708Z * [new branch] gh/desertfire/572/head -> origin/gh/desertfire/572/head 2025-08-14T21:20:14.8068460Z * [new branch] gh/desertfire/572/orig -> origin/gh/desertfire/572/orig 2025-08-14T21:20:14.8071000Z * [new branch] gh/desertfire/589/base -> origin/gh/desertfire/589/base 2025-08-14T21:20:14.8072842Z * [new branch] gh/desertfire/589/head -> origin/gh/desertfire/589/head 2025-08-14T21:20:14.8074699Z * [new branch] gh/desertfire/589/orig -> origin/gh/desertfire/589/orig 2025-08-14T21:20:14.8077538Z * [new branch] gh/desertfire/590/base -> origin/gh/desertfire/590/base 2025-08-14T21:20:14.8079265Z * [new branch] gh/desertfire/590/head -> origin/gh/desertfire/590/head 2025-08-14T21:20:14.8081040Z * [new branch] gh/desertfire/590/orig -> origin/gh/desertfire/590/orig 2025-08-14T21:20:14.8083477Z * [new branch] gh/desertfire/591/base -> origin/gh/desertfire/591/base 2025-08-14T21:20:14.8085467Z * [new branch] gh/desertfire/591/head -> origin/gh/desertfire/591/head 2025-08-14T21:20:14.8087270Z * [new branch] gh/desertfire/591/orig -> origin/gh/desertfire/591/orig 2025-08-14T21:20:14.8089729Z * [new branch] gh/desertfire/592/base -> origin/gh/desertfire/592/base 2025-08-14T21:20:14.8091474Z * [new branch] gh/desertfire/592/head -> origin/gh/desertfire/592/head 2025-08-14T21:20:14.8093382Z * [new branch] gh/desertfire/592/orig -> origin/gh/desertfire/592/orig 2025-08-14T21:20:14.8095871Z * [new branch] gh/desertfire/593/base -> origin/gh/desertfire/593/base 2025-08-14T21:20:14.8097660Z * [new branch] gh/desertfire/593/head -> origin/gh/desertfire/593/head 2025-08-14T21:20:14.8099543Z * [new branch] gh/desertfire/593/orig -> origin/gh/desertfire/593/orig 2025-08-14T21:20:14.8102059Z * [new branch] gh/desertfire/594/base -> origin/gh/desertfire/594/base 2025-08-14T21:20:14.8103828Z * [new branch] gh/desertfire/594/head -> origin/gh/desertfire/594/head 2025-08-14T21:20:14.8105692Z * [new branch] gh/desertfire/594/orig -> origin/gh/desertfire/594/orig 2025-08-14T21:20:14.8108133Z * [new branch] gh/desertfire/595/base -> origin/gh/desertfire/595/base 2025-08-14T21:20:14.8109876Z * [new branch] gh/desertfire/595/head -> origin/gh/desertfire/595/head 2025-08-14T21:20:14.8111725Z * [new branch] gh/desertfire/595/orig -> origin/gh/desertfire/595/orig 2025-08-14T21:20:14.8114177Z * [new branch] gh/desertfire/596/base -> origin/gh/desertfire/596/base 2025-08-14T21:20:14.8116021Z * [new branch] gh/desertfire/596/head -> origin/gh/desertfire/596/head 2025-08-14T21:20:14.8117882Z * [new branch] gh/desertfire/596/orig -> origin/gh/desertfire/596/orig 2025-08-14T21:20:14.8120366Z * [new branch] gh/desertfire/597/base -> origin/gh/desertfire/597/base 2025-08-14T21:20:14.8122172Z * [new branch] gh/desertfire/597/head -> origin/gh/desertfire/597/head 2025-08-14T21:20:14.8123980Z * [new branch] gh/desertfire/597/orig -> origin/gh/desertfire/597/orig 2025-08-14T21:20:14.8127470Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base 2025-08-14T21:20:14.8129268Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head 2025-08-14T21:20:14.8132009Z * [new branch] gh/dharakk/4/base -> origin/gh/dharakk/4/base 2025-08-14T21:20:14.8133461Z * [new branch] gh/dharakk/4/head -> origin/gh/dharakk/4/head 2025-08-14T21:20:14.8135279Z * [new branch] gh/dharakk/4/orig -> origin/gh/dharakk/4/orig 2025-08-14T21:20:14.8138357Z * [new branch] gh/drisspg/140/base -> origin/gh/drisspg/140/base 2025-08-14T21:20:14.8140052Z * [new branch] gh/drisspg/140/head -> origin/gh/drisspg/140/head 2025-08-14T21:20:14.8141896Z * [new branch] gh/drisspg/140/orig -> origin/gh/drisspg/140/orig 2025-08-14T21:20:14.8144341Z * [new branch] gh/drisspg/149/base -> origin/gh/drisspg/149/base 2025-08-14T21:20:14.8146182Z * [new branch] gh/drisspg/149/head -> origin/gh/drisspg/149/head 2025-08-14T21:20:14.8148180Z * [new branch] gh/drisspg/149/orig -> origin/gh/drisspg/149/orig 2025-08-14T21:20:14.8152838Z * [new branch] gh/drisspg/150/base -> origin/gh/drisspg/150/base 2025-08-14T21:20:14.8154584Z * [new branch] gh/drisspg/150/head -> origin/gh/drisspg/150/head 2025-08-14T21:20:14.8156306Z * [new branch] gh/drisspg/150/orig -> origin/gh/drisspg/150/orig 2025-08-14T21:20:14.8158743Z * [new branch] gh/drisspg/151/base -> origin/gh/drisspg/151/base 2025-08-14T21:20:14.8160489Z * [new branch] gh/drisspg/151/head -> origin/gh/drisspg/151/head 2025-08-14T21:20:14.8162287Z * [new branch] gh/drisspg/151/orig -> origin/gh/drisspg/151/orig 2025-08-14T21:20:14.8164808Z * [new branch] gh/drisspg/158/base -> origin/gh/drisspg/158/base 2025-08-14T21:20:14.8166636Z * [new branch] gh/drisspg/158/head -> origin/gh/drisspg/158/head 2025-08-14T21:20:14.8168459Z * [new branch] gh/drisspg/158/orig -> origin/gh/drisspg/158/orig 2025-08-14T21:20:14.8170859Z * [new branch] gh/drisspg/159/base -> origin/gh/drisspg/159/base 2025-08-14T21:20:14.8172623Z * [new branch] gh/drisspg/159/head -> origin/gh/drisspg/159/head 2025-08-14T21:20:14.8174359Z * [new branch] gh/drisspg/159/orig -> origin/gh/drisspg/159/orig 2025-08-14T21:20:14.8177002Z * [new branch] gh/drisspg/166/base -> origin/gh/drisspg/166/base 2025-08-14T21:20:14.8178689Z * [new branch] gh/drisspg/166/head -> origin/gh/drisspg/166/head 2025-08-14T21:20:14.8180433Z * [new branch] gh/drisspg/166/orig -> origin/gh/drisspg/166/orig 2025-08-14T21:20:14.8183083Z * [new branch] gh/drisspg/168/base -> origin/gh/drisspg/168/base 2025-08-14T21:20:14.8184762Z * [new branch] gh/drisspg/168/head -> origin/gh/drisspg/168/head 2025-08-14T21:20:14.8186504Z * [new branch] gh/drisspg/168/orig -> origin/gh/drisspg/168/orig 2025-08-14T21:20:14.8188914Z * [new branch] gh/drisspg/169/base -> origin/gh/drisspg/169/base 2025-08-14T21:20:14.8190763Z * [new branch] gh/drisspg/169/head -> origin/gh/drisspg/169/head 2025-08-14T21:20:14.8192596Z * [new branch] gh/drisspg/169/orig -> origin/gh/drisspg/169/orig 2025-08-14T21:20:14.8195220Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base 2025-08-14T21:20:14.8196819Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head 2025-08-14T21:20:14.8198566Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig 2025-08-14T21:20:14.8201104Z * [new branch] gh/drisspg/171/base -> origin/gh/drisspg/171/base 2025-08-14T21:20:14.8202865Z * [new branch] gh/drisspg/171/head -> origin/gh/drisspg/171/head 2025-08-14T21:20:14.8204887Z * [new branch] gh/drisspg/171/orig -> origin/gh/drisspg/171/orig 2025-08-14T21:20:14.8207415Z * [new branch] gh/drisspg/172/base -> origin/gh/drisspg/172/base 2025-08-14T21:20:14.8209117Z * [new branch] gh/drisspg/172/head -> origin/gh/drisspg/172/head 2025-08-14T21:20:14.8210922Z * [new branch] gh/drisspg/172/orig -> origin/gh/drisspg/172/orig 2025-08-14T21:20:14.8213461Z * [new branch] gh/drisspg/173/base -> origin/gh/drisspg/173/base 2025-08-14T21:20:14.8215201Z * [new branch] gh/drisspg/173/head -> origin/gh/drisspg/173/head 2025-08-14T21:20:14.8216862Z * [new branch] gh/drisspg/173/orig -> origin/gh/drisspg/173/orig 2025-08-14T21:20:14.8219341Z * [new branch] gh/drisspg/174/base -> origin/gh/drisspg/174/base 2025-08-14T21:20:14.8221129Z * [new branch] gh/drisspg/174/head -> origin/gh/drisspg/174/head 2025-08-14T21:20:14.8222906Z * [new branch] gh/drisspg/174/orig -> origin/gh/drisspg/174/orig 2025-08-14T21:20:14.8225823Z * [new branch] gh/drisspg/175/base -> origin/gh/drisspg/175/base 2025-08-14T21:20:14.8227565Z * [new branch] gh/drisspg/175/head -> origin/gh/drisspg/175/head 2025-08-14T21:20:14.8229336Z * [new branch] gh/drisspg/175/orig -> origin/gh/drisspg/175/orig 2025-08-14T21:20:14.8231962Z * [new branch] gh/drisspg/176/base -> origin/gh/drisspg/176/base 2025-08-14T21:20:14.8233651Z * [new branch] gh/drisspg/176/head -> origin/gh/drisspg/176/head 2025-08-14T21:20:14.8235475Z * [new branch] gh/drisspg/176/orig -> origin/gh/drisspg/176/orig 2025-08-14T21:20:14.8238481Z * [new branch] gh/drisspg/177/base -> origin/gh/drisspg/177/base 2025-08-14T21:20:14.8240221Z * [new branch] gh/drisspg/177/head -> origin/gh/drisspg/177/head 2025-08-14T21:20:14.8241987Z * [new branch] gh/drisspg/177/orig -> origin/gh/drisspg/177/orig 2025-08-14T21:20:14.8245067Z * [new branch] gh/drisspg/178/base -> origin/gh/drisspg/178/base 2025-08-14T21:20:14.8247172Z * [new branch] gh/drisspg/178/head -> origin/gh/drisspg/178/head 2025-08-14T21:20:14.8248765Z * [new branch] gh/drisspg/178/orig -> origin/gh/drisspg/178/orig 2025-08-14T21:20:14.8251271Z * [new branch] gh/drisspg/179/base -> origin/gh/drisspg/179/base 2025-08-14T21:20:14.8253050Z * [new branch] gh/drisspg/179/head -> origin/gh/drisspg/179/head 2025-08-14T21:20:14.8254966Z * [new branch] gh/drisspg/179/orig -> origin/gh/drisspg/179/orig 2025-08-14T21:20:14.8257368Z * [new branch] gh/drisspg/180/base -> origin/gh/drisspg/180/base 2025-08-14T21:20:14.8259141Z * [new branch] gh/drisspg/180/head -> origin/gh/drisspg/180/head 2025-08-14T21:20:14.8261121Z * [new branch] gh/drisspg/180/orig -> origin/gh/drisspg/180/orig 2025-08-14T21:20:14.8263381Z * [new branch] gh/drisspg/181/base -> origin/gh/drisspg/181/base 2025-08-14T21:20:14.8265266Z * [new branch] gh/drisspg/181/head -> origin/gh/drisspg/181/head 2025-08-14T21:20:14.8267060Z * [new branch] gh/drisspg/181/orig -> origin/gh/drisspg/181/orig 2025-08-14T21:20:14.8269605Z * [new branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base 2025-08-14T21:20:14.8271534Z * [new branch] gh/drisspg/182/head -> origin/gh/drisspg/182/head 2025-08-14T21:20:14.8274122Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base 2025-08-14T21:20:14.8275753Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head 2025-08-14T21:20:14.8278062Z * [new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base 2025-08-14T21:20:14.8279699Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head 2025-08-14T21:20:14.8282848Z * [new branch] gh/drisspg/185/base -> origin/gh/drisspg/185/base 2025-08-14T21:20:14.8284731Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head 2025-08-14T21:20:14.8288193Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base 2025-08-14T21:20:14.8289810Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head 2025-08-14T21:20:14.8292835Z * [new branch] gh/eellison/784/base -> origin/gh/eellison/784/base 2025-08-14T21:20:14.8294580Z * [new branch] gh/eellison/784/head -> origin/gh/eellison/784/head 2025-08-14T21:20:14.8296455Z * [new branch] gh/eellison/784/orig -> origin/gh/eellison/784/orig 2025-08-14T21:20:14.8299192Z * [new branch] gh/eellison/785/base -> origin/gh/eellison/785/base 2025-08-14T21:20:14.8300956Z * [new branch] gh/eellison/785/head -> origin/gh/eellison/785/head 2025-08-14T21:20:14.8302820Z * [new branch] gh/eellison/785/orig -> origin/gh/eellison/785/orig 2025-08-14T21:20:14.8305255Z * [new branch] gh/eellison/789/base -> origin/gh/eellison/789/base 2025-08-14T21:20:14.8307028Z * [new branch] gh/eellison/789/head -> origin/gh/eellison/789/head 2025-08-14T21:20:14.8308769Z * [new branch] gh/eellison/789/orig -> origin/gh/eellison/789/orig 2025-08-14T21:20:14.8311260Z * [new branch] gh/eellison/800/base -> origin/gh/eellison/800/base 2025-08-14T21:20:14.8313039Z * [new branch] gh/eellison/800/head -> origin/gh/eellison/800/head 2025-08-14T21:20:14.8314781Z * [new branch] gh/eellison/800/orig -> origin/gh/eellison/800/orig 2025-08-14T21:20:14.8317255Z * [new branch] gh/eellison/801/base -> origin/gh/eellison/801/base 2025-08-14T21:20:14.8319073Z * [new branch] gh/eellison/801/head -> origin/gh/eellison/801/head 2025-08-14T21:20:14.8320925Z * [new branch] gh/eellison/801/orig -> origin/gh/eellison/801/orig 2025-08-14T21:20:14.8323411Z * [new branch] gh/eellison/802/base -> origin/gh/eellison/802/base 2025-08-14T21:20:14.8325393Z * [new branch] gh/eellison/802/head -> origin/gh/eellison/802/head 2025-08-14T21:20:14.8327231Z * [new branch] gh/eellison/802/orig -> origin/gh/eellison/802/orig 2025-08-14T21:20:14.8329811Z * [new branch] gh/eellison/805/base -> origin/gh/eellison/805/base 2025-08-14T21:20:14.8331621Z * [new branch] gh/eellison/805/head -> origin/gh/eellison/805/head 2025-08-14T21:20:14.8333329Z * [new branch] gh/eellison/805/orig -> origin/gh/eellison/805/orig 2025-08-14T21:20:14.8335866Z * [new branch] gh/eellison/808/base -> origin/gh/eellison/808/base 2025-08-14T21:20:14.8337633Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head 2025-08-14T21:20:14.8339452Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig 2025-08-14T21:20:14.8342208Z * [new branch] gh/eellison/809/base -> origin/gh/eellison/809/base 2025-08-14T21:20:14.8343647Z * [new branch] gh/eellison/809/head -> origin/gh/eellison/809/head 2025-08-14T21:20:14.8345626Z * [new branch] gh/eellison/809/orig -> origin/gh/eellison/809/orig 2025-08-14T21:20:14.8348155Z * [new branch] gh/eellison/810/base -> origin/gh/eellison/810/base 2025-08-14T21:20:14.8350185Z * [new branch] gh/eellison/810/head -> origin/gh/eellison/810/head 2025-08-14T21:20:14.8351899Z * [new branch] gh/eellison/810/orig -> origin/gh/eellison/810/orig 2025-08-14T21:20:14.8354346Z * [new branch] gh/eellison/811/base -> origin/gh/eellison/811/base 2025-08-14T21:20:14.8356251Z * [new branch] gh/eellison/811/head -> origin/gh/eellison/811/head 2025-08-14T21:20:14.8357886Z * [new branch] gh/eellison/811/orig -> origin/gh/eellison/811/orig 2025-08-14T21:20:14.8360481Z * [new branch] gh/eellison/812/base -> origin/gh/eellison/812/base 2025-08-14T21:20:14.8362116Z * [new branch] gh/eellison/812/head -> origin/gh/eellison/812/head 2025-08-14T21:20:14.8363845Z * [new branch] gh/eellison/812/orig -> origin/gh/eellison/812/orig 2025-08-14T21:20:14.8366526Z * [new branch] gh/eellison/813/base -> origin/gh/eellison/813/base 2025-08-14T21:20:14.8368342Z * [new branch] gh/eellison/813/head -> origin/gh/eellison/813/head 2025-08-14T21:20:14.8370501Z * [new branch] gh/eellison/813/orig -> origin/gh/eellison/813/orig 2025-08-14T21:20:14.8373523Z * [new branch] gh/etaf/132/base -> origin/gh/etaf/132/base 2025-08-14T21:20:14.8375630Z * [new branch] gh/etaf/132/head -> origin/gh/etaf/132/head 2025-08-14T21:20:14.8377116Z * [new branch] gh/etaf/132/orig -> origin/gh/etaf/132/orig 2025-08-14T21:20:14.8379724Z * [new branch] gh/etaf/138/base -> origin/gh/etaf/138/base 2025-08-14T21:20:14.8381447Z * [new branch] gh/etaf/138/head -> origin/gh/etaf/138/head 2025-08-14T21:20:14.8383254Z * [new branch] gh/etaf/138/orig -> origin/gh/etaf/138/orig 2025-08-14T21:20:14.8385779Z * [new branch] gh/etaf/140/base -> origin/gh/etaf/140/base 2025-08-14T21:20:14.8387586Z * [new branch] gh/etaf/140/head -> origin/gh/etaf/140/head 2025-08-14T21:20:14.8389403Z * [new branch] gh/etaf/140/orig -> origin/gh/etaf/140/orig 2025-08-14T21:20:14.8392512Z * [new branch] gh/etaf/143/base -> origin/gh/etaf/143/base 2025-08-14T21:20:14.8393882Z * [new branch] gh/etaf/143/head -> origin/gh/etaf/143/head 2025-08-14T21:20:14.8395634Z * [new branch] gh/etaf/143/orig -> origin/gh/etaf/143/orig 2025-08-14T21:20:14.8398139Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base 2025-08-14T21:20:14.8399878Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head 2025-08-14T21:20:14.8402376Z * [new branch] gh/etaf/148/base -> origin/gh/etaf/148/base 2025-08-14T21:20:14.8404296Z * [new branch] gh/etaf/148/head -> origin/gh/etaf/148/head 2025-08-14T21:20:14.8406137Z * [new branch] gh/etaf/148/orig -> origin/gh/etaf/148/orig 2025-08-14T21:20:14.8408624Z * [new branch] gh/etaf/149/base -> origin/gh/etaf/149/base 2025-08-14T21:20:14.8410459Z * [new branch] gh/etaf/149/head -> origin/gh/etaf/149/head 2025-08-14T21:20:14.8412239Z * [new branch] gh/etaf/149/orig -> origin/gh/etaf/149/orig 2025-08-14T21:20:14.8414848Z * [new branch] gh/etaf/150/base -> origin/gh/etaf/150/base 2025-08-14T21:20:14.8416721Z * [new branch] gh/etaf/150/head -> origin/gh/etaf/150/head 2025-08-14T21:20:14.8419221Z * [new branch] gh/etaf/150/orig -> origin/gh/etaf/150/orig 2025-08-14T21:20:14.8421754Z * [new branch] gh/etaf/151/base -> origin/gh/etaf/151/base 2025-08-14T21:20:14.8423631Z * [new branch] gh/etaf/151/head -> origin/gh/etaf/151/head 2025-08-14T21:20:14.8425485Z * [new branch] gh/etaf/151/orig -> origin/gh/etaf/151/orig 2025-08-14T21:20:14.8428049Z * [new branch] gh/etaf/152/base -> origin/gh/etaf/152/base 2025-08-14T21:20:14.8430392Z * [new branch] gh/etaf/152/head -> origin/gh/etaf/152/head 2025-08-14T21:20:14.8432232Z * [new branch] gh/etaf/152/orig -> origin/gh/etaf/152/orig 2025-08-14T21:20:14.8434887Z * [new branch] gh/etaf/153/base -> origin/gh/etaf/153/base 2025-08-14T21:20:14.8436703Z * [new branch] gh/etaf/153/head -> origin/gh/etaf/153/head 2025-08-14T21:20:14.8438411Z * [new branch] gh/etaf/153/orig -> origin/gh/etaf/153/orig 2025-08-14T21:20:14.8440802Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base 2025-08-14T21:20:14.8442610Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head 2025-08-14T21:20:14.8444583Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig 2025-08-14T21:20:14.8447667Z * [new branch] gh/etaf/155/base -> origin/gh/etaf/155/base 2025-08-14T21:20:14.8449433Z * [new branch] gh/etaf/155/head -> origin/gh/etaf/155/head 2025-08-14T21:20:14.8451089Z * [new branch] gh/etaf/155/orig -> origin/gh/etaf/155/orig 2025-08-14T21:20:14.8454278Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base 2025-08-14T21:20:14.8456120Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head 2025-08-14T21:20:14.8457906Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig 2025-08-14T21:20:14.8460370Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base 2025-08-14T21:20:14.8462143Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head 2025-08-14T21:20:14.8463931Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig 2025-08-14T21:20:14.8466422Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base 2025-08-14T21:20:14.8468269Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head 2025-08-14T21:20:14.8470117Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig 2025-08-14T21:20:14.8472601Z * [new branch] gh/ezyang/3068/base -> origin/gh/ezyang/3068/base 2025-08-14T21:20:14.8474346Z * [new branch] gh/ezyang/3068/head -> origin/gh/ezyang/3068/head 2025-08-14T21:20:14.8476173Z * [new branch] gh/ezyang/3068/orig -> origin/gh/ezyang/3068/orig 2025-08-14T21:20:14.8478664Z * [new branch] gh/ezyang/3071/base -> origin/gh/ezyang/3071/base 2025-08-14T21:20:14.8480395Z * [new branch] gh/ezyang/3071/head -> origin/gh/ezyang/3071/head 2025-08-14T21:20:14.8482716Z * [new branch] gh/ezyang/3071/orig -> origin/gh/ezyang/3071/orig 2025-08-14T21:20:14.8485388Z * [new branch] gh/ezyang/3074/base -> origin/gh/ezyang/3074/base 2025-08-14T21:20:14.8487151Z * [new branch] gh/ezyang/3074/head -> origin/gh/ezyang/3074/head 2025-08-14T21:20:14.8489242Z * [new branch] gh/ezyang/3074/orig -> origin/gh/ezyang/3074/orig 2025-08-14T21:20:14.8491433Z * [new branch] gh/ezyang/3088/base -> origin/gh/ezyang/3088/base 2025-08-14T21:20:14.8493374Z * [new branch] gh/ezyang/3088/head -> origin/gh/ezyang/3088/head 2025-08-14T21:20:14.8495015Z * [new branch] gh/ezyang/3088/orig -> origin/gh/ezyang/3088/orig 2025-08-14T21:20:14.8497498Z * [new branch] gh/ezyang/3092/base -> origin/gh/ezyang/3092/base 2025-08-14T21:20:14.8499388Z * [new branch] gh/ezyang/3092/head -> origin/gh/ezyang/3092/head 2025-08-14T21:20:14.8501101Z * [new branch] gh/ezyang/3092/orig -> origin/gh/ezyang/3092/orig 2025-08-14T21:20:14.8503619Z * [new branch] gh/ezyang/3097/base -> origin/gh/ezyang/3097/base 2025-08-14T21:20:14.8505433Z * [new branch] gh/ezyang/3097/head -> origin/gh/ezyang/3097/head 2025-08-14T21:20:14.8507297Z * [new branch] gh/ezyang/3097/orig -> origin/gh/ezyang/3097/orig 2025-08-14T21:20:14.8509847Z * [new branch] gh/ezyang/3098/base -> origin/gh/ezyang/3098/base 2025-08-14T21:20:14.8511979Z * [new branch] gh/ezyang/3098/head -> origin/gh/ezyang/3098/head 2025-08-14T21:20:14.8513546Z * [new branch] gh/ezyang/3098/orig -> origin/gh/ezyang/3098/orig 2025-08-14T21:20:14.8515836Z * [new branch] gh/ezyang/3099/base -> origin/gh/ezyang/3099/base 2025-08-14T21:20:14.8517732Z * [new branch] gh/ezyang/3099/head -> origin/gh/ezyang/3099/head 2025-08-14T21:20:14.8519588Z * [new branch] gh/ezyang/3099/orig -> origin/gh/ezyang/3099/orig 2025-08-14T21:20:14.8522075Z * [new branch] gh/ezyang/3100/base -> origin/gh/ezyang/3100/base 2025-08-14T21:20:14.8523825Z * [new branch] gh/ezyang/3100/head -> origin/gh/ezyang/3100/head 2025-08-14T21:20:14.8525729Z * [new branch] gh/ezyang/3100/orig -> origin/gh/ezyang/3100/orig 2025-08-14T21:20:14.8528267Z * [new branch] gh/ezyang/3101/base -> origin/gh/ezyang/3101/base 2025-08-14T21:20:14.8529991Z * [new branch] gh/ezyang/3101/head -> origin/gh/ezyang/3101/head 2025-08-14T21:20:14.8531868Z * [new branch] gh/ezyang/3101/orig -> origin/gh/ezyang/3101/orig 2025-08-14T21:20:14.8534425Z * [new branch] gh/ezyang/3102/base -> origin/gh/ezyang/3102/base 2025-08-14T21:20:14.8536240Z * [new branch] gh/ezyang/3102/head -> origin/gh/ezyang/3102/head 2025-08-14T21:20:14.8538006Z * [new branch] gh/ezyang/3102/orig -> origin/gh/ezyang/3102/orig 2025-08-14T21:20:14.8540451Z * [new branch] gh/ezyang/3103/base -> origin/gh/ezyang/3103/base 2025-08-14T21:20:14.8542336Z * [new branch] gh/ezyang/3103/head -> origin/gh/ezyang/3103/head 2025-08-14T21:20:14.8544085Z * [new branch] gh/ezyang/3103/orig -> origin/gh/ezyang/3103/orig 2025-08-14T21:20:14.8547367Z * [new branch] gh/ezyang/3104/base -> origin/gh/ezyang/3104/base 2025-08-14T21:20:14.8550482Z * [new branch] gh/ezyang/3104/head -> origin/gh/ezyang/3104/head 2025-08-14T21:20:14.8552266Z * [new branch] gh/ezyang/3104/orig -> origin/gh/ezyang/3104/orig 2025-08-14T21:20:14.8554833Z * [new branch] gh/ezyang/3105/base -> origin/gh/ezyang/3105/base 2025-08-14T21:20:14.8556612Z * [new branch] gh/ezyang/3105/head -> origin/gh/ezyang/3105/head 2025-08-14T21:20:14.8558381Z * [new branch] gh/ezyang/3105/orig -> origin/gh/ezyang/3105/orig 2025-08-14T21:20:14.8562053Z * [new branch] gh/ezyang/3106/base -> origin/gh/ezyang/3106/base 2025-08-14T21:20:14.8563809Z * [new branch] gh/ezyang/3106/head -> origin/gh/ezyang/3106/head 2025-08-14T21:20:14.8565884Z * [new branch] gh/ezyang/3106/orig -> origin/gh/ezyang/3106/orig 2025-08-14T21:20:14.8568471Z * [new branch] gh/ezyang/3107/base -> origin/gh/ezyang/3107/base 2025-08-14T21:20:14.8570273Z * [new branch] gh/ezyang/3107/head -> origin/gh/ezyang/3107/head 2025-08-14T21:20:14.8572008Z * [new branch] gh/ezyang/3107/orig -> origin/gh/ezyang/3107/orig 2025-08-14T21:20:14.8574513Z * [new branch] gh/ezyang/3108/base -> origin/gh/ezyang/3108/base 2025-08-14T21:20:14.8576301Z * [new branch] gh/ezyang/3108/head -> origin/gh/ezyang/3108/head 2025-08-14T21:20:14.8578068Z * [new branch] gh/ezyang/3108/orig -> origin/gh/ezyang/3108/orig 2025-08-14T21:20:14.8580621Z * [new branch] gh/ezyang/3109/base -> origin/gh/ezyang/3109/base 2025-08-14T21:20:14.8582407Z * [new branch] gh/ezyang/3109/head -> origin/gh/ezyang/3109/head 2025-08-14T21:20:14.8584561Z * [new branch] gh/ezyang/3109/orig -> origin/gh/ezyang/3109/orig 2025-08-14T21:20:14.8587263Z * [new branch] gh/ezyang/3110/base -> origin/gh/ezyang/3110/base 2025-08-14T21:20:14.8588969Z * [new branch] gh/ezyang/3110/head -> origin/gh/ezyang/3110/head 2025-08-14T21:20:14.8590871Z * [new branch] gh/ezyang/3110/orig -> origin/gh/ezyang/3110/orig 2025-08-14T21:20:14.8593406Z * [new branch] gh/ezyang/3111/base -> origin/gh/ezyang/3111/base 2025-08-14T21:20:14.8595282Z * [new branch] gh/ezyang/3111/head -> origin/gh/ezyang/3111/head 2025-08-14T21:20:14.8597065Z * [new branch] gh/ezyang/3111/orig -> origin/gh/ezyang/3111/orig 2025-08-14T21:20:14.8600077Z * [new branch] gh/ezyang/3112/base -> origin/gh/ezyang/3112/base 2025-08-14T21:20:14.8601905Z * [new branch] gh/ezyang/3112/head -> origin/gh/ezyang/3112/head 2025-08-14T21:20:14.8603706Z * [new branch] gh/ezyang/3112/orig -> origin/gh/ezyang/3112/orig 2025-08-14T21:20:14.8606366Z * [new branch] gh/ezyang/3113/base -> origin/gh/ezyang/3113/base 2025-08-14T21:20:14.8608121Z * [new branch] gh/ezyang/3113/head -> origin/gh/ezyang/3113/head 2025-08-14T21:20:14.8609891Z * [new branch] gh/ezyang/3113/orig -> origin/gh/ezyang/3113/orig 2025-08-14T21:20:14.8612422Z * [new branch] gh/ezyang/3114/base -> origin/gh/ezyang/3114/base 2025-08-14T21:20:14.8614290Z * [new branch] gh/ezyang/3114/head -> origin/gh/ezyang/3114/head 2025-08-14T21:20:14.8616069Z * [new branch] gh/ezyang/3114/orig -> origin/gh/ezyang/3114/orig 2025-08-14T21:20:14.8618600Z * [new branch] gh/ezyang/3115/base -> origin/gh/ezyang/3115/base 2025-08-14T21:20:14.8620488Z * [new branch] gh/ezyang/3115/head -> origin/gh/ezyang/3115/head 2025-08-14T21:20:14.8622370Z * [new branch] gh/ezyang/3115/orig -> origin/gh/ezyang/3115/orig 2025-08-14T21:20:14.8624964Z * [new branch] gh/ezyang/3116/base -> origin/gh/ezyang/3116/base 2025-08-14T21:20:14.8626664Z * [new branch] gh/ezyang/3116/head -> origin/gh/ezyang/3116/head 2025-08-14T21:20:14.8628461Z * [new branch] gh/ezyang/3116/orig -> origin/gh/ezyang/3116/orig 2025-08-14T21:20:14.8631459Z * [new branch] gh/ezyang/3117/base -> origin/gh/ezyang/3117/base 2025-08-14T21:20:14.8633207Z * [new branch] gh/ezyang/3117/head -> origin/gh/ezyang/3117/head 2025-08-14T21:20:14.8635045Z * [new branch] gh/ezyang/3117/orig -> origin/gh/ezyang/3117/orig 2025-08-14T21:20:14.8637442Z * [new branch] gh/ezyang/3118/base -> origin/gh/ezyang/3118/base 2025-08-14T21:20:14.8639247Z * [new branch] gh/ezyang/3118/head -> origin/gh/ezyang/3118/head 2025-08-14T21:20:14.8641047Z * [new branch] gh/ezyang/3118/orig -> origin/gh/ezyang/3118/orig 2025-08-14T21:20:14.8643558Z * [new branch] gh/ezyang/3119/base -> origin/gh/ezyang/3119/base 2025-08-14T21:20:14.8645616Z * [new branch] gh/ezyang/3119/head -> origin/gh/ezyang/3119/head 2025-08-14T21:20:14.8647640Z * [new branch] gh/ezyang/3119/orig -> origin/gh/ezyang/3119/orig 2025-08-14T21:20:14.8650185Z * [new branch] gh/ezyang/3120/base -> origin/gh/ezyang/3120/base 2025-08-14T21:20:14.8651814Z * [new branch] gh/ezyang/3120/head -> origin/gh/ezyang/3120/head 2025-08-14T21:20:14.8653535Z * [new branch] gh/ezyang/3120/orig -> origin/gh/ezyang/3120/orig 2025-08-14T21:20:14.8656049Z * [new branch] gh/ezyang/3121/base -> origin/gh/ezyang/3121/base 2025-08-14T21:20:14.8658163Z * [new branch] gh/ezyang/3121/head -> origin/gh/ezyang/3121/head 2025-08-14T21:20:14.8659821Z * [new branch] gh/ezyang/3121/orig -> origin/gh/ezyang/3121/orig 2025-08-14T21:20:14.8662132Z * [new branch] gh/ezyang/3122/base -> origin/gh/ezyang/3122/base 2025-08-14T21:20:14.8663897Z * [new branch] gh/ezyang/3122/head -> origin/gh/ezyang/3122/head 2025-08-14T21:20:14.8665746Z * [new branch] gh/ezyang/3122/orig -> origin/gh/ezyang/3122/orig 2025-08-14T21:20:14.8668300Z * [new branch] gh/ezyang/3123/base -> origin/gh/ezyang/3123/base 2025-08-14T21:20:14.8670262Z * [new branch] gh/ezyang/3123/head -> origin/gh/ezyang/3123/head 2025-08-14T21:20:14.8672093Z * [new branch] gh/ezyang/3123/orig -> origin/gh/ezyang/3123/orig 2025-08-14T21:20:14.8674770Z * [new branch] gh/ezyang/3124/base -> origin/gh/ezyang/3124/base 2025-08-14T21:20:14.8676396Z * [new branch] gh/ezyang/3124/head -> origin/gh/ezyang/3124/head 2025-08-14T21:20:14.8678155Z * [new branch] gh/ezyang/3124/orig -> origin/gh/ezyang/3124/orig 2025-08-14T21:20:14.8680944Z * [new branch] gh/ezyang/3125/base -> origin/gh/ezyang/3125/base 2025-08-14T21:20:14.8682440Z * [new branch] gh/ezyang/3125/head -> origin/gh/ezyang/3125/head 2025-08-14T21:20:14.8684162Z * [new branch] gh/ezyang/3125/orig -> origin/gh/ezyang/3125/orig 2025-08-14T21:20:14.8686895Z * [new branch] gh/ezyang/3126/base -> origin/gh/ezyang/3126/base 2025-08-14T21:20:14.8688708Z * [new branch] gh/ezyang/3126/head -> origin/gh/ezyang/3126/head 2025-08-14T21:20:14.8690530Z * [new branch] gh/ezyang/3126/orig -> origin/gh/ezyang/3126/orig 2025-08-14T21:20:14.8693354Z * [new branch] gh/ezyang/3127/base -> origin/gh/ezyang/3127/base 2025-08-14T21:20:14.8695212Z * [new branch] gh/ezyang/3127/head -> origin/gh/ezyang/3127/head 2025-08-14T21:20:14.8698075Z * [new branch] gh/ezyang/3127/orig -> origin/gh/ezyang/3127/orig 2025-08-14T21:20:14.8699561Z * [new branch] gh/ezyang/3128/base -> origin/gh/ezyang/3128/base 2025-08-14T21:20:14.8701252Z * [new branch] gh/ezyang/3128/head -> origin/gh/ezyang/3128/head 2025-08-14T21:20:14.8703281Z * [new branch] gh/ezyang/3128/orig -> origin/gh/ezyang/3128/orig 2025-08-14T21:20:14.8705628Z * [new branch] gh/ezyang/3129/base -> origin/gh/ezyang/3129/base 2025-08-14T21:20:14.8707422Z * [new branch] gh/ezyang/3129/head -> origin/gh/ezyang/3129/head 2025-08-14T21:20:14.8709222Z * [new branch] gh/ezyang/3129/orig -> origin/gh/ezyang/3129/orig 2025-08-14T21:20:14.8711688Z * [new branch] gh/ezyang/3130/base -> origin/gh/ezyang/3130/base 2025-08-14T21:20:14.8713432Z * [new branch] gh/ezyang/3130/head -> origin/gh/ezyang/3130/head 2025-08-14T21:20:14.8715216Z * [new branch] gh/ezyang/3130/orig -> origin/gh/ezyang/3130/orig 2025-08-14T21:20:14.8717747Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base 2025-08-14T21:20:14.8719577Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head 2025-08-14T21:20:14.8721367Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig 2025-08-14T21:20:14.8723853Z * [new branch] gh/ezyang/3132/base -> origin/gh/ezyang/3132/base 2025-08-14T21:20:14.8725721Z * [new branch] gh/ezyang/3132/head -> origin/gh/ezyang/3132/head 2025-08-14T21:20:14.8727473Z * [new branch] gh/ezyang/3132/orig -> origin/gh/ezyang/3132/orig 2025-08-14T21:20:14.8730002Z * [new branch] gh/ezyang/3133/base -> origin/gh/ezyang/3133/base 2025-08-14T21:20:14.8731757Z * [new branch] gh/ezyang/3133/head -> origin/gh/ezyang/3133/head 2025-08-14T21:20:14.8733631Z * [new branch] gh/ezyang/3133/orig -> origin/gh/ezyang/3133/orig 2025-08-14T21:20:14.8736118Z * [new branch] gh/ezyang/3134/base -> origin/gh/ezyang/3134/base 2025-08-14T21:20:14.8739512Z * [new branch] gh/ezyang/3134/head -> origin/gh/ezyang/3134/head 2025-08-14T21:20:14.8740891Z * [new branch] gh/ezyang/3134/orig -> origin/gh/ezyang/3134/orig 2025-08-14T21:20:14.8741750Z * [new branch] gh/ezyang/3135/base -> origin/gh/ezyang/3135/base 2025-08-14T21:20:14.8743941Z * [new branch] gh/ezyang/3135/head -> origin/gh/ezyang/3135/head 2025-08-14T21:20:14.8745741Z * [new branch] gh/ezyang/3135/orig -> origin/gh/ezyang/3135/orig 2025-08-14T21:20:14.8748463Z * [new branch] gh/ezyang/3136/base -> origin/gh/ezyang/3136/base 2025-08-14T21:20:14.8750219Z * [new branch] gh/ezyang/3136/head -> origin/gh/ezyang/3136/head 2025-08-14T21:20:14.8752525Z * [new branch] gh/ezyang/3136/orig -> origin/gh/ezyang/3136/orig 2025-08-14T21:20:14.8755853Z * [new branch] gh/fadara01/1/base -> origin/gh/fadara01/1/base 2025-08-14T21:20:14.8757377Z * [new branch] gh/fadara01/1/head -> origin/gh/fadara01/1/head 2025-08-14T21:20:14.8759102Z * [new branch] gh/fadara01/1/orig -> origin/gh/fadara01/1/orig 2025-08-14T21:20:14.8762246Z * [new branch] gh/fduwjj/168/base -> origin/gh/fduwjj/168/base 2025-08-14T21:20:14.8764105Z * [new branch] gh/fduwjj/168/head -> origin/gh/fduwjj/168/head 2025-08-14T21:20:14.8766040Z * [new branch] gh/fduwjj/168/orig -> origin/gh/fduwjj/168/orig 2025-08-14T21:20:14.8768768Z * [new branch] gh/fduwjj/169/base -> origin/gh/fduwjj/169/base 2025-08-14T21:20:14.8770611Z * [new branch] gh/fduwjj/169/head -> origin/gh/fduwjj/169/head 2025-08-14T21:20:14.8772416Z * [new branch] gh/fduwjj/169/orig -> origin/gh/fduwjj/169/orig 2025-08-14T21:20:14.8775163Z * [new branch] gh/fduwjj/170/base -> origin/gh/fduwjj/170/base 2025-08-14T21:20:14.8776981Z * [new branch] gh/fduwjj/170/head -> origin/gh/fduwjj/170/head 2025-08-14T21:20:14.8779349Z * [new branch] gh/fduwjj/170/orig -> origin/gh/fduwjj/170/orig 2025-08-14T21:20:14.8782092Z * [new branch] gh/fduwjj/171/base -> origin/gh/fduwjj/171/base 2025-08-14T21:20:14.8784178Z * [new branch] gh/fduwjj/171/head -> origin/gh/fduwjj/171/head 2025-08-14T21:20:14.8785772Z * [new branch] gh/fduwjj/171/orig -> origin/gh/fduwjj/171/orig 2025-08-14T21:20:14.8788514Z * [new branch] gh/fduwjj/172/base -> origin/gh/fduwjj/172/base 2025-08-14T21:20:14.8790097Z * [new branch] gh/fduwjj/172/head -> origin/gh/fduwjj/172/head 2025-08-14T21:20:14.8791845Z * [new branch] gh/fduwjj/172/orig -> origin/gh/fduwjj/172/orig 2025-08-14T21:20:14.8794219Z * [new branch] gh/fduwjj/173/base -> origin/gh/fduwjj/173/base 2025-08-14T21:20:14.8796093Z * [new branch] gh/fduwjj/173/head -> origin/gh/fduwjj/173/head 2025-08-14T21:20:14.8797843Z * [new branch] gh/fduwjj/173/orig -> origin/gh/fduwjj/173/orig 2025-08-14T21:20:14.8800213Z * [new branch] gh/fduwjj/174/base -> origin/gh/fduwjj/174/base 2025-08-14T21:20:14.8802035Z * [new branch] gh/fduwjj/174/head -> origin/gh/fduwjj/174/head 2025-08-14T21:20:14.8803848Z * [new branch] gh/fduwjj/174/orig -> origin/gh/fduwjj/174/orig 2025-08-14T21:20:14.8806636Z * [new branch] gh/fduwjj/175/base -> origin/gh/fduwjj/175/base 2025-08-14T21:20:14.8808760Z * [new branch] gh/fduwjj/175/head -> origin/gh/fduwjj/175/head 2025-08-14T21:20:14.8810422Z * [new branch] gh/fduwjj/175/orig -> origin/gh/fduwjj/175/orig 2025-08-14T21:20:14.8812862Z * [new branch] gh/fduwjj/176/base -> origin/gh/fduwjj/176/base 2025-08-14T21:20:14.8814659Z * [new branch] gh/fduwjj/176/head -> origin/gh/fduwjj/176/head 2025-08-14T21:20:14.8816510Z * [new branch] gh/fduwjj/176/orig -> origin/gh/fduwjj/176/orig 2025-08-14T21:20:14.8818988Z * [new branch] gh/fduwjj/177/base -> origin/gh/fduwjj/177/base 2025-08-14T21:20:14.8820784Z * [new branch] gh/fduwjj/177/head -> origin/gh/fduwjj/177/head 2025-08-14T21:20:14.8822569Z * [new branch] gh/fduwjj/177/orig -> origin/gh/fduwjj/177/orig 2025-08-14T21:20:14.8825106Z * [new branch] gh/fduwjj/178/base -> origin/gh/fduwjj/178/base 2025-08-14T21:20:14.8827006Z * [new branch] gh/fduwjj/178/head -> origin/gh/fduwjj/178/head 2025-08-14T21:20:14.8828780Z * [new branch] gh/fduwjj/178/orig -> origin/gh/fduwjj/178/orig 2025-08-14T21:20:14.8831795Z * [new branch] gh/fduwjj/179/base -> origin/gh/fduwjj/179/base 2025-08-14T21:20:14.8832923Z * [new branch] gh/fduwjj/179/head -> origin/gh/fduwjj/179/head 2025-08-14T21:20:14.8834801Z * [new branch] gh/fduwjj/179/orig -> origin/gh/fduwjj/179/orig 2025-08-14T21:20:14.8837274Z * [new branch] gh/fduwjj/180/base -> origin/gh/fduwjj/180/base 2025-08-14T21:20:14.8839068Z * [new branch] gh/fduwjj/180/head -> origin/gh/fduwjj/180/head 2025-08-14T21:20:14.8840835Z * [new branch] gh/fduwjj/180/orig -> origin/gh/fduwjj/180/orig 2025-08-14T21:20:14.8843418Z * [new branch] gh/fduwjj/181/base -> origin/gh/fduwjj/181/base 2025-08-14T21:20:14.8845433Z * [new branch] gh/fduwjj/181/head -> origin/gh/fduwjj/181/head 2025-08-14T21:20:14.8847786Z * [new branch] gh/fduwjj/181/orig -> origin/gh/fduwjj/181/orig 2025-08-14T21:20:14.8850448Z * [new branch] gh/fegin/306/base -> origin/gh/fegin/306/base 2025-08-14T21:20:14.8852208Z * [new branch] gh/fegin/306/head -> origin/gh/fegin/306/head 2025-08-14T21:20:14.8853928Z * [new branch] gh/fegin/306/orig -> origin/gh/fegin/306/orig 2025-08-14T21:20:14.8856386Z * [new branch] gh/fegin/307/base -> origin/gh/fegin/307/base 2025-08-14T21:20:14.8858180Z * [new branch] gh/fegin/307/head -> origin/gh/fegin/307/head 2025-08-14T21:20:14.8860056Z * [new branch] gh/fegin/307/orig -> origin/gh/fegin/307/orig 2025-08-14T21:20:14.8862953Z * [new branch] gh/fffrog/114/base -> origin/gh/fffrog/114/base 2025-08-14T21:20:14.8864853Z * [new branch] gh/fffrog/114/head -> origin/gh/fffrog/114/head 2025-08-14T21:20:14.8866789Z * [new branch] gh/fffrog/114/orig -> origin/gh/fffrog/114/orig 2025-08-14T21:20:14.8869284Z * [new branch] gh/fffrog/117/base -> origin/gh/fffrog/117/base 2025-08-14T21:20:14.8871182Z * [new branch] gh/fffrog/117/head -> origin/gh/fffrog/117/head 2025-08-14T21:20:14.8873279Z * [new branch] gh/fffrog/117/orig -> origin/gh/fffrog/117/orig 2025-08-14T21:20:14.8875464Z * [new branch] gh/fffrog/119/base -> origin/gh/fffrog/119/base 2025-08-14T21:20:14.8877357Z * [new branch] gh/fffrog/119/head -> origin/gh/fffrog/119/head 2025-08-14T21:20:14.8878982Z * [new branch] gh/fffrog/119/orig -> origin/gh/fffrog/119/orig 2025-08-14T21:20:14.8881393Z * [new branch] gh/fffrog/120/base -> origin/gh/fffrog/120/base 2025-08-14T21:20:14.8883322Z * [new branch] gh/fffrog/120/head -> origin/gh/fffrog/120/head 2025-08-14T21:20:14.8885163Z * [new branch] gh/fffrog/120/orig -> origin/gh/fffrog/120/orig 2025-08-14T21:20:14.8887979Z * [new branch] gh/fffrog/121/base -> origin/gh/fffrog/121/base 2025-08-14T21:20:14.8889689Z * [new branch] gh/fffrog/121/head -> origin/gh/fffrog/121/head 2025-08-14T21:20:14.8891999Z * [new branch] gh/fffrog/121/orig -> origin/gh/fffrog/121/orig 2025-08-14T21:20:14.8894687Z * [new branch] gh/fffrog/122/base -> origin/gh/fffrog/122/base 2025-08-14T21:20:14.8896418Z * [new branch] gh/fffrog/122/head -> origin/gh/fffrog/122/head 2025-08-14T21:20:14.8898193Z * [new branch] gh/fffrog/122/orig -> origin/gh/fffrog/122/orig 2025-08-14T21:20:14.8900690Z * [new branch] gh/fffrog/123/base -> origin/gh/fffrog/123/base 2025-08-14T21:20:14.8902218Z * [new branch] gh/fffrog/123/head -> origin/gh/fffrog/123/head 2025-08-14T21:20:14.8904255Z * [new branch] gh/fffrog/123/orig -> origin/gh/fffrog/123/orig 2025-08-14T21:20:14.8906746Z * [new branch] gh/fffrog/124/base -> origin/gh/fffrog/124/base 2025-08-14T21:20:14.8908612Z * [new branch] gh/fffrog/124/head -> origin/gh/fffrog/124/head 2025-08-14T21:20:14.8910302Z * [new branch] gh/fffrog/124/orig -> origin/gh/fffrog/124/orig 2025-08-14T21:20:14.8912890Z * [new branch] gh/fffrog/125/base -> origin/gh/fffrog/125/base 2025-08-14T21:20:14.8914584Z * [new branch] gh/fffrog/125/head -> origin/gh/fffrog/125/head 2025-08-14T21:20:14.8916352Z * [new branch] gh/fffrog/125/orig -> origin/gh/fffrog/125/orig 2025-08-14T21:20:14.8919192Z * [new branch] gh/fffrog/126/base -> origin/gh/fffrog/126/base 2025-08-14T21:20:14.8921548Z * [new branch] gh/fffrog/126/head -> origin/gh/fffrog/126/head 2025-08-14T21:20:14.8923312Z * [new branch] gh/fffrog/126/orig -> origin/gh/fffrog/126/orig 2025-08-14T21:20:14.8926140Z * [new branch] gh/fffrog/127/base -> origin/gh/fffrog/127/base 2025-08-14T21:20:14.8927938Z * [new branch] gh/fffrog/127/head -> origin/gh/fffrog/127/head 2025-08-14T21:20:14.8929783Z * [new branch] gh/fffrog/127/orig -> origin/gh/fffrog/127/orig 2025-08-14T21:20:14.8932214Z * [new branch] gh/fffrog/128/base -> origin/gh/fffrog/128/base 2025-08-14T21:20:14.8933997Z * [new branch] gh/fffrog/128/head -> origin/gh/fffrog/128/head 2025-08-14T21:20:14.8935613Z * [new branch] gh/fffrog/128/orig -> origin/gh/fffrog/128/orig 2025-08-14T21:20:14.8938108Z * [new branch] gh/fffrog/129/base -> origin/gh/fffrog/129/base 2025-08-14T21:20:14.8940011Z * [new branch] gh/fffrog/129/head -> origin/gh/fffrog/129/head 2025-08-14T21:20:14.8941775Z * [new branch] gh/fffrog/129/orig -> origin/gh/fffrog/129/orig 2025-08-14T21:20:14.8944256Z * [new branch] gh/fffrog/130/base -> origin/gh/fffrog/130/base 2025-08-14T21:20:14.8945987Z * [new branch] gh/fffrog/130/head -> origin/gh/fffrog/130/head 2025-08-14T21:20:14.8948191Z * [new branch] gh/fffrog/130/orig -> origin/gh/fffrog/130/orig 2025-08-14T21:20:14.8951792Z * [new branch] gh/fffrog/131/base -> origin/gh/fffrog/131/base 2025-08-14T21:20:14.8953556Z * [new branch] gh/fffrog/131/head -> origin/gh/fffrog/131/head 2025-08-14T21:20:14.8955227Z * [new branch] gh/fffrog/131/orig -> origin/gh/fffrog/131/orig 2025-08-14T21:20:14.8957839Z * [new branch] gh/fffrog/132/base -> origin/gh/fffrog/132/base 2025-08-14T21:20:14.8959499Z * [new branch] gh/fffrog/132/head -> origin/gh/fffrog/132/head 2025-08-14T21:20:14.8961335Z * [new branch] gh/fffrog/132/orig -> origin/gh/fffrog/132/orig 2025-08-14T21:20:14.8964225Z * [new branch] gh/fffrog/133/base -> origin/gh/fffrog/133/base 2025-08-14T21:20:14.8966267Z * [new branch] gh/fffrog/133/head -> origin/gh/fffrog/133/head 2025-08-14T21:20:14.8968029Z * [new branch] gh/fffrog/133/orig -> origin/gh/fffrog/133/orig 2025-08-14T21:20:14.8970577Z * [new branch] gh/fffrog/134/base -> origin/gh/fffrog/134/base 2025-08-14T21:20:14.8972788Z * [new branch] gh/fffrog/134/head -> origin/gh/fffrog/134/head 2025-08-14T21:20:14.8974613Z * [new branch] gh/fffrog/134/orig -> origin/gh/fffrog/134/orig 2025-08-14T21:20:14.8977590Z * [new branch] gh/fffrog/135/base -> origin/gh/fffrog/135/base 2025-08-14T21:20:14.8979352Z * [new branch] gh/fffrog/135/head -> origin/gh/fffrog/135/head 2025-08-14T21:20:14.8981662Z * [new branch] gh/fffrog/135/orig -> origin/gh/fffrog/135/orig 2025-08-14T21:20:14.8984249Z * [new branch] gh/fffrog/136/base -> origin/gh/fffrog/136/base 2025-08-14T21:20:14.8986074Z * [new branch] gh/fffrog/136/head -> origin/gh/fffrog/136/head 2025-08-14T21:20:14.8987813Z * [new branch] gh/fffrog/136/orig -> origin/gh/fffrog/136/orig 2025-08-14T21:20:14.8990250Z * [new branch] gh/fffrog/137/base -> origin/gh/fffrog/137/base 2025-08-14T21:20:14.8992167Z * [new branch] gh/fffrog/137/head -> origin/gh/fffrog/137/head 2025-08-14T21:20:14.8993935Z * [new branch] gh/fffrog/137/orig -> origin/gh/fffrog/137/orig 2025-08-14T21:20:14.8996448Z * [new branch] gh/fffrog/138/base -> origin/gh/fffrog/138/base 2025-08-14T21:20:14.8998192Z * [new branch] gh/fffrog/138/head -> origin/gh/fffrog/138/head 2025-08-14T21:20:14.9000003Z * [new branch] gh/fffrog/138/orig -> origin/gh/fffrog/138/orig 2025-08-14T21:20:14.9003098Z * [new branch] gh/gmagogsfm/1/base -> origin/gh/gmagogsfm/1/base 2025-08-14T21:20:14.9005099Z * [new branch] gh/gmagogsfm/1/head -> origin/gh/gmagogsfm/1/head 2025-08-14T21:20:14.9006839Z * [new branch] gh/gmagogsfm/1/orig -> origin/gh/gmagogsfm/1/orig 2025-08-14T21:20:14.9009670Z * [new branch] gh/gmagogsfm/2/base -> origin/gh/gmagogsfm/2/base 2025-08-14T21:20:14.9010806Z * [new branch] gh/gmagogsfm/2/head -> origin/gh/gmagogsfm/2/head 2025-08-14T21:20:14.9012959Z * [new branch] gh/gmagogsfm/2/orig -> origin/gh/gmagogsfm/2/orig 2025-08-14T21:20:14.9015461Z * [new branch] gh/gmagogsfm/3/base -> origin/gh/gmagogsfm/3/base 2025-08-14T21:20:14.9017099Z * [new branch] gh/gmagogsfm/3/head -> origin/gh/gmagogsfm/3/head 2025-08-14T21:20:14.9018867Z * [new branch] gh/gmagogsfm/3/orig -> origin/gh/gmagogsfm/3/orig 2025-08-14T21:20:14.9021320Z * [new branch] gh/gmagogsfm/4/base -> origin/gh/gmagogsfm/4/base 2025-08-14T21:20:14.9023234Z * [new branch] gh/gmagogsfm/4/head -> origin/gh/gmagogsfm/4/head 2025-08-14T21:20:14.9024885Z * [new branch] gh/gmagogsfm/4/orig -> origin/gh/gmagogsfm/4/orig 2025-08-14T21:20:14.9027899Z * [new branch] gh/guangyey/130/base -> origin/gh/guangyey/130/base 2025-08-14T21:20:14.9029684Z * [new branch] gh/guangyey/130/head -> origin/gh/guangyey/130/head 2025-08-14T21:20:14.9031430Z * [new branch] gh/guangyey/130/orig -> origin/gh/guangyey/130/orig 2025-08-14T21:20:14.9034053Z * [new branch] gh/guangyey/133/base -> origin/gh/guangyey/133/base 2025-08-14T21:20:14.9035846Z * [new branch] gh/guangyey/133/head -> origin/gh/guangyey/133/head 2025-08-14T21:20:14.9037489Z * [new branch] gh/guangyey/133/orig -> origin/gh/guangyey/133/orig 2025-08-14T21:20:14.9039991Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base 2025-08-14T21:20:14.9041768Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head 2025-08-14T21:20:14.9043538Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig 2025-08-14T21:20:14.9046131Z * [new branch] gh/guangyey/135/base -> origin/gh/guangyey/135/base 2025-08-14T21:20:14.9048269Z * [new branch] gh/guangyey/135/head -> origin/gh/guangyey/135/head 2025-08-14T21:20:14.9050138Z * [new branch] gh/guangyey/135/orig -> origin/gh/guangyey/135/orig 2025-08-14T21:20:14.9052533Z * [new branch] gh/guangyey/139/base -> origin/gh/guangyey/139/base 2025-08-14T21:20:14.9054366Z * [new branch] gh/guangyey/139/head -> origin/gh/guangyey/139/head 2025-08-14T21:20:14.9056019Z * [new branch] gh/guangyey/139/orig -> origin/gh/guangyey/139/orig 2025-08-14T21:20:14.9058504Z * [new branch] gh/guangyey/140/base -> origin/gh/guangyey/140/base 2025-08-14T21:20:14.9060362Z * [new branch] gh/guangyey/140/head -> origin/gh/guangyey/140/head 2025-08-14T21:20:14.9062119Z * [new branch] gh/guangyey/140/orig -> origin/gh/guangyey/140/orig 2025-08-14T21:20:14.9064680Z * [new branch] gh/guangyey/142/base -> origin/gh/guangyey/142/base 2025-08-14T21:20:14.9066504Z * [new branch] gh/guangyey/142/head -> origin/gh/guangyey/142/head 2025-08-14T21:20:14.9068293Z * [new branch] gh/guangyey/142/orig -> origin/gh/guangyey/142/orig 2025-08-14T21:20:14.9070759Z * [new branch] gh/guangyey/145/base -> origin/gh/guangyey/145/base 2025-08-14T21:20:14.9072605Z * [new branch] gh/guangyey/145/head -> origin/gh/guangyey/145/head 2025-08-14T21:20:14.9074366Z * [new branch] gh/guangyey/145/orig -> origin/gh/guangyey/145/orig 2025-08-14T21:20:14.9076714Z * [new branch] gh/guangyey/153/base -> origin/gh/guangyey/153/base 2025-08-14T21:20:14.9078550Z * [new branch] gh/guangyey/153/head -> origin/gh/guangyey/153/head 2025-08-14T21:20:14.9080266Z * [new branch] gh/guangyey/153/orig -> origin/gh/guangyey/153/orig 2025-08-14T21:20:14.9082757Z * [new branch] gh/guangyey/158/base -> origin/gh/guangyey/158/base 2025-08-14T21:20:14.9084712Z * [new branch] gh/guangyey/158/head -> origin/gh/guangyey/158/head 2025-08-14T21:20:14.9086492Z * [new branch] gh/guangyey/158/orig -> origin/gh/guangyey/158/orig 2025-08-14T21:20:14.9089008Z * [new branch] gh/guangyey/159/base -> origin/gh/guangyey/159/base 2025-08-14T21:20:14.9090793Z * [new branch] gh/guangyey/159/head -> origin/gh/guangyey/159/head 2025-08-14T21:20:14.9092593Z * [new branch] gh/guangyey/159/orig -> origin/gh/guangyey/159/orig 2025-08-14T21:20:14.9095206Z * [new branch] gh/guangyey/163/base -> origin/gh/guangyey/163/base 2025-08-14T21:20:14.9096794Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head 2025-08-14T21:20:14.9098522Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig 2025-08-14T21:20:14.9100981Z * [new branch] gh/guangyey/165/base -> origin/gh/guangyey/165/base 2025-08-14T21:20:14.9102740Z * [new branch] gh/guangyey/165/head -> origin/gh/guangyey/165/head 2025-08-14T21:20:14.9104775Z * [new branch] gh/guangyey/165/orig -> origin/gh/guangyey/165/orig 2025-08-14T21:20:14.9106983Z * [new branch] gh/guangyey/168/base -> origin/gh/guangyey/168/base 2025-08-14T21:20:14.9108795Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head 2025-08-14T21:20:14.9110545Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig 2025-08-14T21:20:14.9113275Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base 2025-08-14T21:20:14.9115067Z * [new branch] gh/guangyey/169/head -> origin/gh/guangyey/169/head 2025-08-14T21:20:14.9116746Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig 2025-08-14T21:20:14.9119220Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base 2025-08-14T21:20:14.9121391Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head 2025-08-14T21:20:14.9122898Z * [new branch] gh/guangyey/170/orig -> origin/gh/guangyey/170/orig 2025-08-14T21:20:14.9126137Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base 2025-08-14T21:20:14.9127801Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head 2025-08-14T21:20:14.9129642Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig 2025-08-14T21:20:14.9132255Z * [new branch] gh/guangyey/172/base -> origin/gh/guangyey/172/base 2025-08-14T21:20:14.9133980Z * [new branch] gh/guangyey/172/head -> origin/gh/guangyey/172/head 2025-08-14T21:20:14.9136010Z * [new branch] gh/guangyey/172/orig -> origin/gh/guangyey/172/orig 2025-08-14T21:20:14.9138916Z * [new branch] gh/guangyey/173/base -> origin/gh/guangyey/173/base 2025-08-14T21:20:14.9140711Z * [new branch] gh/guangyey/173/head -> origin/gh/guangyey/173/head 2025-08-14T21:20:14.9142408Z * [new branch] gh/guangyey/173/orig -> origin/gh/guangyey/173/orig 2025-08-14T21:20:14.9144960Z * [new branch] gh/guangyey/174/base -> origin/gh/guangyey/174/base 2025-08-14T21:20:14.9146878Z * [new branch] gh/guangyey/174/head -> origin/gh/guangyey/174/head 2025-08-14T21:20:14.9148923Z * [new branch] gh/guangyey/174/orig -> origin/gh/guangyey/174/orig 2025-08-14T21:20:14.9151265Z * [new branch] gh/guangyey/175/base -> origin/gh/guangyey/175/base 2025-08-14T21:20:14.9153170Z * [new branch] gh/guangyey/175/head -> origin/gh/guangyey/175/head 2025-08-14T21:20:14.9160891Z * [new branch] gh/guangyey/175/orig -> origin/gh/guangyey/175/orig 2025-08-14T21:20:14.9161188Z * [new branch] gh/guangyey/176/base -> origin/gh/guangyey/176/base 2025-08-14T21:20:14.9161618Z * [new branch] gh/guangyey/176/head -> origin/gh/guangyey/176/head 2025-08-14T21:20:14.9161820Z * [new branch] gh/guangyey/176/orig -> origin/gh/guangyey/176/orig 2025-08-14T21:20:14.9164007Z * [new branch] gh/guangyey/177/base -> origin/gh/guangyey/177/base 2025-08-14T21:20:14.9165651Z * [new branch] gh/guangyey/177/head -> origin/gh/guangyey/177/head 2025-08-14T21:20:14.9167468Z * [new branch] gh/guangyey/177/orig -> origin/gh/guangyey/177/orig 2025-08-14T21:20:14.9169989Z * [new branch] gh/guangyey/178/base -> origin/gh/guangyey/178/base 2025-08-14T21:20:14.9171752Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head 2025-08-14T21:20:14.9173526Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig 2025-08-14T21:20:14.9176052Z * [new branch] gh/guangyey/179/base -> origin/gh/guangyey/179/base 2025-08-14T21:20:14.9177937Z * [new branch] gh/guangyey/179/head -> origin/gh/guangyey/179/head 2025-08-14T21:20:14.9179592Z * [new branch] gh/guangyey/179/orig -> origin/gh/guangyey/179/orig 2025-08-14T21:20:14.9182112Z * [new branch] gh/guangyey/180/base -> origin/gh/guangyey/180/base 2025-08-14T21:20:14.9183912Z * [new branch] gh/guangyey/180/head -> origin/gh/guangyey/180/head 2025-08-14T21:20:14.9186326Z * [new branch] gh/guangyey/180/orig -> origin/gh/guangyey/180/orig 2025-08-14T21:20:14.9188951Z * [new branch] gh/guangyey/181/base -> origin/gh/guangyey/181/base 2025-08-14T21:20:14.9190714Z * [new branch] gh/guangyey/181/head -> origin/gh/guangyey/181/head 2025-08-14T21:20:14.9192491Z * [new branch] gh/guangyey/181/orig -> origin/gh/guangyey/181/orig 2025-08-14T21:20:14.9195041Z * [new branch] gh/guangyey/182/base -> origin/gh/guangyey/182/base 2025-08-14T21:20:14.9196796Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head 2025-08-14T21:20:14.9198601Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig 2025-08-14T21:20:14.9201464Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base 2025-08-14T21:20:14.9203226Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head 2025-08-14T21:20:14.9205082Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig 2025-08-14T21:20:14.9207641Z * [new branch] gh/guangyey/184/base -> origin/gh/guangyey/184/base 2025-08-14T21:20:14.9209376Z * [new branch] gh/guangyey/184/head -> origin/gh/guangyey/184/head 2025-08-14T21:20:14.9211180Z * [new branch] gh/guangyey/184/orig -> origin/gh/guangyey/184/orig 2025-08-14T21:20:14.9213776Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base 2025-08-14T21:20:14.9215603Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head 2025-08-14T21:20:14.9217353Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig 2025-08-14T21:20:14.9219909Z * [new branch] gh/guangyey/79/base -> origin/gh/guangyey/79/base 2025-08-14T21:20:14.9221643Z * [new branch] gh/guangyey/79/head -> origin/gh/guangyey/79/head 2025-08-14T21:20:14.9223389Z * [new branch] gh/guangyey/79/orig -> origin/gh/guangyey/79/orig 2025-08-14T21:20:14.9225912Z * [new branch] gh/guangyey/89/base -> origin/gh/guangyey/89/base 2025-08-14T21:20:14.9227775Z * [new branch] gh/guangyey/89/head -> origin/gh/guangyey/89/head 2025-08-14T21:20:14.9230044Z * [new branch] gh/guangyey/89/orig -> origin/gh/guangyey/89/orig 2025-08-14T21:20:14.9232952Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-08-14T21:20:14.9234772Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-08-14T21:20:14.9236526Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-08-14T21:20:14.9238911Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-08-14T21:20:14.9240701Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-08-14T21:20:14.9242441Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-08-14T21:20:14.9245591Z * [new branch] gh/guilhermeleobas/124/base -> origin/gh/guilhermeleobas/124/base 2025-08-14T21:20:14.9247749Z * [new branch] gh/guilhermeleobas/124/head -> origin/gh/guilhermeleobas/124/head 2025-08-14T21:20:14.9249709Z * [new branch] gh/guilhermeleobas/124/orig -> origin/gh/guilhermeleobas/124/orig 2025-08-14T21:20:14.9251934Z * [new branch] gh/guilhermeleobas/147/base -> origin/gh/guilhermeleobas/147/base 2025-08-14T21:20:14.9253804Z * [new branch] gh/guilhermeleobas/147/head -> origin/gh/guilhermeleobas/147/head 2025-08-14T21:20:14.9255471Z * [new branch] gh/guilhermeleobas/147/orig -> origin/gh/guilhermeleobas/147/orig 2025-08-14T21:20:14.9257900Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-08-14T21:20:14.9259683Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-08-14T21:20:14.9261507Z * [new branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-08-14T21:20:14.9264006Z * [new branch] gh/guilhermeleobas/163/base -> origin/gh/guilhermeleobas/163/base 2025-08-14T21:20:14.9265890Z * [new branch] gh/guilhermeleobas/163/head -> origin/gh/guilhermeleobas/163/head 2025-08-14T21:20:14.9267694Z * [new branch] gh/guilhermeleobas/163/orig -> origin/gh/guilhermeleobas/163/orig 2025-08-14T21:20:14.9270018Z * [new branch] gh/guilhermeleobas/164/base -> origin/gh/guilhermeleobas/164/base 2025-08-14T21:20:14.9271807Z * [new branch] gh/guilhermeleobas/164/head -> origin/gh/guilhermeleobas/164/head 2025-08-14T21:20:14.9273600Z * [new branch] gh/guilhermeleobas/164/orig -> origin/gh/guilhermeleobas/164/orig 2025-08-14T21:20:14.9276017Z * [new branch] gh/guilhermeleobas/165/base -> origin/gh/guilhermeleobas/165/base 2025-08-14T21:20:14.9277773Z * [new branch] gh/guilhermeleobas/165/head -> origin/gh/guilhermeleobas/165/head 2025-08-14T21:20:14.9279568Z * [new branch] gh/guilhermeleobas/165/orig -> origin/gh/guilhermeleobas/165/orig 2025-08-14T21:20:14.9282040Z * [new branch] gh/guilhermeleobas/166/base -> origin/gh/guilhermeleobas/166/base 2025-08-14T21:20:14.9283840Z * [new branch] gh/guilhermeleobas/166/head -> origin/gh/guilhermeleobas/166/head 2025-08-14T21:20:14.9285879Z * [new branch] gh/guilhermeleobas/166/orig -> origin/gh/guilhermeleobas/166/orig 2025-08-14T21:20:14.9288352Z * [new branch] gh/guilhermeleobas/167/base -> origin/gh/guilhermeleobas/167/base 2025-08-14T21:20:14.9290141Z * [new branch] gh/guilhermeleobas/167/head -> origin/gh/guilhermeleobas/167/head 2025-08-14T21:20:14.9291981Z * [new branch] gh/guilhermeleobas/167/orig -> origin/gh/guilhermeleobas/167/orig 2025-08-14T21:20:14.9294405Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-08-14T21:20:14.9296245Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-08-14T21:20:14.9298088Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-08-14T21:20:14.9300622Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 2025-08-14T21:20:14.9302662Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-08-14T21:20:14.9304471Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-08-14T21:20:14.9307176Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-08-14T21:20:14.9308715Z * [new branch] gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-08-14T21:20:14.9310518Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-08-14T21:20:14.9312939Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-08-14T21:20:14.9314699Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-08-14T21:20:14.9316630Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-08-14T21:20:14.9318979Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-08-14T21:20:14.9320744Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-08-14T21:20:14.9322504Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-08-14T21:20:14.9325140Z * [new branch] gh/guilhermeleobas/181/base -> origin/gh/guilhermeleobas/181/base 2025-08-14T21:20:14.9326918Z * [new branch] gh/guilhermeleobas/181/head -> origin/gh/guilhermeleobas/181/head 2025-08-14T21:20:14.9328741Z * [new branch] gh/guilhermeleobas/181/orig -> origin/gh/guilhermeleobas/181/orig 2025-08-14T21:20:14.9332493Z * [new branch] gh/guilhermeleobas/182/base -> origin/gh/guilhermeleobas/182/base 2025-08-14T21:20:14.9334277Z * [new branch] gh/guilhermeleobas/182/head -> origin/gh/guilhermeleobas/182/head 2025-08-14T21:20:14.9336200Z * [new branch] gh/guilhermeleobas/182/orig -> origin/gh/guilhermeleobas/182/orig 2025-08-14T21:20:14.9338745Z * [new branch] gh/guilhermeleobas/183/base -> origin/gh/guilhermeleobas/183/base 2025-08-14T21:20:14.9340564Z * [new branch] gh/guilhermeleobas/183/head -> origin/gh/guilhermeleobas/183/head 2025-08-14T21:20:14.9342312Z * [new branch] gh/guilhermeleobas/183/orig -> origin/gh/guilhermeleobas/183/orig 2025-08-14T21:20:14.9344842Z * [new branch] gh/guilhermeleobas/184/base -> origin/gh/guilhermeleobas/184/base 2025-08-14T21:20:14.9346660Z * [new branch] gh/guilhermeleobas/184/head -> origin/gh/guilhermeleobas/184/head 2025-08-14T21:20:14.9348865Z * [new branch] gh/guilhermeleobas/184/orig -> origin/gh/guilhermeleobas/184/orig 2025-08-14T21:20:14.9351262Z * [new branch] gh/guilhermeleobas/185/base -> origin/gh/guilhermeleobas/185/base 2025-08-14T21:20:14.9353051Z * [new branch] gh/guilhermeleobas/185/head -> origin/gh/guilhermeleobas/185/head 2025-08-14T21:20:14.9354889Z * [new branch] gh/guilhermeleobas/185/orig -> origin/gh/guilhermeleobas/185/orig 2025-08-14T21:20:14.9357492Z * [new branch] gh/guilhermeleobas/188/base -> origin/gh/guilhermeleobas/188/base 2025-08-14T21:20:14.9359230Z * [new branch] gh/guilhermeleobas/188/head -> origin/gh/guilhermeleobas/188/head 2025-08-14T21:20:14.9361640Z * [new branch] gh/guilhermeleobas/188/orig -> origin/gh/guilhermeleobas/188/orig 2025-08-14T21:20:14.9364239Z * [new branch] gh/guilhermeleobas/189/base -> origin/gh/guilhermeleobas/189/base 2025-08-14T21:20:14.9366295Z * [new branch] gh/guilhermeleobas/189/head -> origin/gh/guilhermeleobas/189/head 2025-08-14T21:20:14.9368041Z * [new branch] gh/guilhermeleobas/189/orig -> origin/gh/guilhermeleobas/189/orig 2025-08-14T21:20:14.9370656Z * [new branch] gh/guilhermeleobas/190/base -> origin/gh/guilhermeleobas/190/base 2025-08-14T21:20:14.9372359Z * [new branch] gh/guilhermeleobas/190/head -> origin/gh/guilhermeleobas/190/head 2025-08-14T21:20:14.9374136Z * [new branch] gh/guilhermeleobas/190/orig -> origin/gh/guilhermeleobas/190/orig 2025-08-14T21:20:14.9376699Z * [new branch] gh/guilhermeleobas/192/base -> origin/gh/guilhermeleobas/192/base 2025-08-14T21:20:14.9378493Z * [new branch] gh/guilhermeleobas/192/head -> origin/gh/guilhermeleobas/192/head 2025-08-14T21:20:14.9380214Z * [new branch] gh/guilhermeleobas/192/orig -> origin/gh/guilhermeleobas/192/orig 2025-08-14T21:20:14.9382750Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-08-14T21:20:14.9385097Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-08-14T21:20:14.9387286Z * [new branch] gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-08-14T21:20:14.9389591Z * [new branch] gh/guilhermeleobas/194/base -> origin/gh/guilhermeleobas/194/base 2025-08-14T21:20:14.9391452Z * [new branch] gh/guilhermeleobas/194/head -> origin/gh/guilhermeleobas/194/head 2025-08-14T21:20:14.9393101Z * [new branch] gh/guilhermeleobas/194/orig -> origin/gh/guilhermeleobas/194/orig 2025-08-14T21:20:14.9395675Z * [new branch] gh/guilhermeleobas/203/base -> origin/gh/guilhermeleobas/203/base 2025-08-14T21:20:14.9397402Z * [new branch] gh/guilhermeleobas/203/head -> origin/gh/guilhermeleobas/203/head 2025-08-14T21:20:14.9399200Z * [new branch] gh/guilhermeleobas/203/orig -> origin/gh/guilhermeleobas/203/orig 2025-08-14T21:20:14.9401733Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-08-14T21:20:14.9403485Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-08-14T21:20:14.9405450Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-08-14T21:20:14.9407991Z * [new branch] gh/guilhermeleobas/205/base -> origin/gh/guilhermeleobas/205/base 2025-08-14T21:20:14.9409755Z * [new branch] gh/guilhermeleobas/205/head -> origin/gh/guilhermeleobas/205/head 2025-08-14T21:20:14.9411637Z * [new branch] gh/guilhermeleobas/205/orig -> origin/gh/guilhermeleobas/205/orig 2025-08-14T21:20:14.9414096Z * [new branch] gh/guilhermeleobas/206/base -> origin/gh/guilhermeleobas/206/base 2025-08-14T21:20:14.9415930Z * [new branch] gh/guilhermeleobas/206/head -> origin/gh/guilhermeleobas/206/head 2025-08-14T21:20:14.9417693Z * [new branch] gh/guilhermeleobas/206/orig -> origin/gh/guilhermeleobas/206/orig 2025-08-14T21:20:14.9420248Z * [new branch] gh/guilhermeleobas/207/base -> origin/gh/guilhermeleobas/207/base 2025-08-14T21:20:14.9422097Z * [new branch] gh/guilhermeleobas/207/head -> origin/gh/guilhermeleobas/207/head 2025-08-14T21:20:14.9423821Z * [new branch] gh/guilhermeleobas/207/orig -> origin/gh/guilhermeleobas/207/orig 2025-08-14T21:20:14.9426269Z * [new branch] gh/guilhermeleobas/208/base -> origin/gh/guilhermeleobas/208/base 2025-08-14T21:20:14.9428088Z * [new branch] gh/guilhermeleobas/208/head -> origin/gh/guilhermeleobas/208/head 2025-08-14T21:20:14.9430040Z * [new branch] gh/guilhermeleobas/208/orig -> origin/gh/guilhermeleobas/208/orig 2025-08-14T21:20:14.9432536Z * [new branch] gh/guilhermeleobas/209/base -> origin/gh/guilhermeleobas/209/base 2025-08-14T21:20:14.9434324Z * [new branch] gh/guilhermeleobas/209/head -> origin/gh/guilhermeleobas/209/head 2025-08-14T21:20:14.9436263Z * [new branch] gh/guilhermeleobas/209/orig -> origin/gh/guilhermeleobas/209/orig 2025-08-14T21:20:14.9438857Z * [new branch] gh/guilhermeleobas/210/base -> origin/gh/guilhermeleobas/210/base 2025-08-14T21:20:14.9440578Z * [new branch] gh/guilhermeleobas/210/head -> origin/gh/guilhermeleobas/210/head 2025-08-14T21:20:14.9442325Z * [new branch] gh/guilhermeleobas/210/orig -> origin/gh/guilhermeleobas/210/orig 2025-08-14T21:20:14.9444820Z * [new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 2025-08-14T21:20:14.9446659Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-08-14T21:20:14.9451600Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-08-14T21:20:14.9454138Z * [new branch] gh/guilhermeleobas/212/base -> origin/gh/guilhermeleobas/212/base 2025-08-14T21:20:14.9455952Z * [new branch] gh/guilhermeleobas/212/head -> origin/gh/guilhermeleobas/212/head 2025-08-14T21:20:14.9457838Z * [new branch] gh/guilhermeleobas/212/orig -> origin/gh/guilhermeleobas/212/orig 2025-08-14T21:20:14.9461078Z * [new branch] gh/guilhermeleobas/213/base -> origin/gh/guilhermeleobas/213/base 2025-08-14T21:20:14.9462738Z * [new branch] gh/guilhermeleobas/213/head -> origin/gh/guilhermeleobas/213/head 2025-08-14T21:20:14.9464593Z * [new branch] gh/guilhermeleobas/213/orig -> origin/gh/guilhermeleobas/213/orig 2025-08-14T21:20:14.9467367Z * [new branch] gh/guilhermeleobas/214/base -> origin/gh/guilhermeleobas/214/base 2025-08-14T21:20:14.9469164Z * [new branch] gh/guilhermeleobas/214/head -> origin/gh/guilhermeleobas/214/head 2025-08-14T21:20:14.9470950Z * [new branch] gh/guilhermeleobas/214/orig -> origin/gh/guilhermeleobas/214/orig 2025-08-14T21:20:14.9473463Z * [new branch] gh/guilhermeleobas/215/base -> origin/gh/guilhermeleobas/215/base 2025-08-14T21:20:14.9475262Z * [new branch] gh/guilhermeleobas/215/head -> origin/gh/guilhermeleobas/215/head 2025-08-14T21:20:14.9477040Z * [new branch] gh/guilhermeleobas/215/orig -> origin/gh/guilhermeleobas/215/orig 2025-08-14T21:20:14.9479540Z * [new branch] gh/guilhermeleobas/216/base -> origin/gh/guilhermeleobas/216/base 2025-08-14T21:20:14.9481405Z * [new branch] gh/guilhermeleobas/216/head -> origin/gh/guilhermeleobas/216/head 2025-08-14T21:20:14.9483197Z * [new branch] gh/guilhermeleobas/216/orig -> origin/gh/guilhermeleobas/216/orig 2025-08-14T21:20:14.9485913Z * [new branch] gh/guilhermeleobas/217/base -> origin/gh/guilhermeleobas/217/base 2025-08-14T21:20:14.9487618Z * [new branch] gh/guilhermeleobas/217/head -> origin/gh/guilhermeleobas/217/head 2025-08-14T21:20:14.9489441Z * [new branch] gh/guilhermeleobas/217/orig -> origin/gh/guilhermeleobas/217/orig 2025-08-14T21:20:14.9491995Z * [new branch] gh/guilhermeleobas/218/base -> origin/gh/guilhermeleobas/218/base 2025-08-14T21:20:14.9493751Z * [new branch] gh/guilhermeleobas/218/head -> origin/gh/guilhermeleobas/218/head 2025-08-14T21:20:14.9495623Z * [new branch] gh/guilhermeleobas/218/orig -> origin/gh/guilhermeleobas/218/orig 2025-08-14T21:20:14.9498130Z * [new branch] gh/guilhermeleobas/219/base -> origin/gh/guilhermeleobas/219/base 2025-08-14T21:20:14.9499874Z * [new branch] gh/guilhermeleobas/219/head -> origin/gh/guilhermeleobas/219/head 2025-08-14T21:20:14.9501681Z * [new branch] gh/guilhermeleobas/219/orig -> origin/gh/guilhermeleobas/219/orig 2025-08-14T21:20:14.9504206Z * [new branch] gh/guilhermeleobas/220/base -> origin/gh/guilhermeleobas/220/base 2025-08-14T21:20:14.9506016Z * [new branch] gh/guilhermeleobas/220/head -> origin/gh/guilhermeleobas/220/head 2025-08-14T21:20:14.9507747Z * [new branch] gh/guilhermeleobas/220/orig -> origin/gh/guilhermeleobas/220/orig 2025-08-14T21:20:14.9510314Z * [new branch] gh/guilhermeleobas/221/base -> origin/gh/guilhermeleobas/221/base 2025-08-14T21:20:14.9512134Z * [new branch] gh/guilhermeleobas/221/head -> origin/gh/guilhermeleobas/221/head 2025-08-14T21:20:14.9514035Z * [new branch] gh/guilhermeleobas/221/orig -> origin/gh/guilhermeleobas/221/orig 2025-08-14T21:20:14.9516514Z * [new branch] gh/guilhermeleobas/222/base -> origin/gh/guilhermeleobas/222/base 2025-08-14T21:20:14.9518343Z * [new branch] gh/guilhermeleobas/222/head -> origin/gh/guilhermeleobas/222/head 2025-08-14T21:20:14.9520819Z * [new branch] gh/guilhermeleobas/222/orig -> origin/gh/guilhermeleobas/222/orig 2025-08-14T21:20:14.9523444Z * [new branch] gh/guilhermeleobas/223/base -> origin/gh/guilhermeleobas/223/base 2025-08-14T21:20:14.9525315Z * [new branch] gh/guilhermeleobas/223/head -> origin/gh/guilhermeleobas/223/head 2025-08-14T21:20:14.9527095Z * [new branch] gh/guilhermeleobas/223/orig -> origin/gh/guilhermeleobas/223/orig 2025-08-14T21:20:14.9529743Z * [new branch] gh/guilhermeleobas/224/base -> origin/gh/guilhermeleobas/224/base 2025-08-14T21:20:14.9531442Z * [new branch] gh/guilhermeleobas/224/head -> origin/gh/guilhermeleobas/224/head 2025-08-14T21:20:14.9533186Z * [new branch] gh/guilhermeleobas/224/orig -> origin/gh/guilhermeleobas/224/orig 2025-08-14T21:20:14.9535682Z * [new branch] gh/guilhermeleobas/225/base -> origin/gh/guilhermeleobas/225/base 2025-08-14T21:20:14.9537429Z * [new branch] gh/guilhermeleobas/225/head -> origin/gh/guilhermeleobas/225/head 2025-08-14T21:20:14.9539353Z * [new branch] gh/guilhermeleobas/225/orig -> origin/gh/guilhermeleobas/225/orig 2025-08-14T21:20:14.9541752Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 2025-08-14T21:20:14.9543499Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-08-14T21:20:14.9545303Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-08-14T21:20:14.9548082Z * [new branch] gh/guilhermeleobas/227/base -> origin/gh/guilhermeleobas/227/base 2025-08-14T21:20:14.9549791Z * [new branch] gh/guilhermeleobas/227/head -> origin/gh/guilhermeleobas/227/head 2025-08-14T21:20:14.9551599Z * [new branch] gh/guilhermeleobas/227/orig -> origin/gh/guilhermeleobas/227/orig 2025-08-14T21:20:14.9554165Z * [new branch] gh/guilhermeleobas/228/base -> origin/gh/guilhermeleobas/228/base 2025-08-14T21:20:14.9556013Z * [new branch] gh/guilhermeleobas/228/head -> origin/gh/guilhermeleobas/228/head 2025-08-14T21:20:14.9557702Z * [new branch] gh/guilhermeleobas/228/orig -> origin/gh/guilhermeleobas/228/orig 2025-08-14T21:20:14.9560221Z * [new branch] gh/guilhermeleobas/229/base -> origin/gh/guilhermeleobas/229/base 2025-08-14T21:20:14.9561964Z * [new branch] gh/guilhermeleobas/229/head -> origin/gh/guilhermeleobas/229/head 2025-08-14T21:20:14.9563821Z * [new branch] gh/guilhermeleobas/229/orig -> origin/gh/guilhermeleobas/229/orig 2025-08-14T21:20:14.9567133Z * [new branch] gh/guilhermeleobas/230/base -> origin/gh/guilhermeleobas/230/base 2025-08-14T21:20:14.9569007Z * [new branch] gh/guilhermeleobas/230/head -> origin/gh/guilhermeleobas/230/head 2025-08-14T21:20:14.9570745Z * [new branch] gh/guilhermeleobas/230/orig -> origin/gh/guilhermeleobas/230/orig 2025-08-14T21:20:14.9573259Z * [new branch] gh/guilhermeleobas/231/base -> origin/gh/guilhermeleobas/231/base 2025-08-14T21:20:14.9574995Z * [new branch] gh/guilhermeleobas/231/head -> origin/gh/guilhermeleobas/231/head 2025-08-14T21:20:14.9576869Z * [new branch] gh/guilhermeleobas/231/orig -> origin/gh/guilhermeleobas/231/orig 2025-08-14T21:20:14.9579387Z * [new branch] gh/guilhermeleobas/232/base -> origin/gh/guilhermeleobas/232/base 2025-08-14T21:20:14.9581183Z * [new branch] gh/guilhermeleobas/232/head -> origin/gh/guilhermeleobas/232/head 2025-08-14T21:20:14.9582978Z * [new branch] gh/guilhermeleobas/232/orig -> origin/gh/guilhermeleobas/232/orig 2025-08-14T21:20:14.9585596Z * [new branch] gh/guilhermeleobas/233/base -> origin/gh/guilhermeleobas/233/base 2025-08-14T21:20:14.9587356Z * [new branch] gh/guilhermeleobas/233/head -> origin/gh/guilhermeleobas/233/head 2025-08-14T21:20:14.9589328Z * [new branch] gh/guilhermeleobas/233/orig -> origin/gh/guilhermeleobas/233/orig 2025-08-14T21:20:14.9591898Z * [new branch] gh/guilhermeleobas/73/base -> origin/gh/guilhermeleobas/73/base 2025-08-14T21:20:14.9593656Z * [new branch] gh/guilhermeleobas/73/head -> origin/gh/guilhermeleobas/73/head 2025-08-14T21:20:14.9595439Z * [new branch] gh/guilhermeleobas/73/orig -> origin/gh/guilhermeleobas/73/orig 2025-08-14T21:20:14.9599169Z * [new branch] gh/henrylhtsang/103/base -> origin/gh/henrylhtsang/103/base 2025-08-14T21:20:14.9600885Z * [new branch] gh/henrylhtsang/103/head -> origin/gh/henrylhtsang/103/head 2025-08-14T21:20:14.9602656Z * [new branch] gh/henrylhtsang/103/orig -> origin/gh/henrylhtsang/103/orig 2025-08-14T21:20:14.9605517Z * [new branch] gh/henrylhtsang/108/base -> origin/gh/henrylhtsang/108/base 2025-08-14T21:20:14.9607360Z * [new branch] gh/henrylhtsang/108/head -> origin/gh/henrylhtsang/108/head 2025-08-14T21:20:14.9609109Z * [new branch] gh/henrylhtsang/108/orig -> origin/gh/henrylhtsang/108/orig 2025-08-14T21:20:14.9611662Z * [new branch] gh/henrylhtsang/118/base -> origin/gh/henrylhtsang/118/base 2025-08-14T21:20:14.9613523Z * [new branch] gh/henrylhtsang/118/head -> origin/gh/henrylhtsang/118/head 2025-08-14T21:20:14.9615331Z * [new branch] gh/henrylhtsang/118/orig -> origin/gh/henrylhtsang/118/orig 2025-08-14T21:20:14.9617939Z * [new branch] gh/henrylhtsang/123/base -> origin/gh/henrylhtsang/123/base 2025-08-14T21:20:14.9619796Z * [new branch] gh/henrylhtsang/123/head -> origin/gh/henrylhtsang/123/head 2025-08-14T21:20:14.9621582Z * [new branch] gh/henrylhtsang/123/orig -> origin/gh/henrylhtsang/123/orig 2025-08-14T21:20:14.9624114Z * [new branch] gh/henrylhtsang/124/base -> origin/gh/henrylhtsang/124/base 2025-08-14T21:20:14.9626017Z * [new branch] gh/henrylhtsang/124/head -> origin/gh/henrylhtsang/124/head 2025-08-14T21:20:14.9627886Z * [new branch] gh/henrylhtsang/124/orig -> origin/gh/henrylhtsang/124/orig 2025-08-14T21:20:14.9630379Z * [new branch] gh/henrylhtsang/125/base -> origin/gh/henrylhtsang/125/base 2025-08-14T21:20:14.9632155Z * [new branch] gh/henrylhtsang/125/head -> origin/gh/henrylhtsang/125/head 2025-08-14T21:20:14.9633924Z * [new branch] gh/henrylhtsang/125/orig -> origin/gh/henrylhtsang/125/orig 2025-08-14T21:20:14.9636182Z * [new branch] gh/henrylhtsang/126/base -> origin/gh/henrylhtsang/126/base 2025-08-14T21:20:14.9638086Z * [new branch] gh/henrylhtsang/126/head -> origin/gh/henrylhtsang/126/head 2025-08-14T21:20:14.9639859Z * [new branch] gh/henrylhtsang/126/orig -> origin/gh/henrylhtsang/126/orig 2025-08-14T21:20:14.9642329Z * [new branch] gh/henrylhtsang/127/base -> origin/gh/henrylhtsang/127/base 2025-08-14T21:20:14.9644153Z * [new branch] gh/henrylhtsang/127/head -> origin/gh/henrylhtsang/127/head 2025-08-14T21:20:14.9646040Z * [new branch] gh/henrylhtsang/127/orig -> origin/gh/henrylhtsang/127/orig 2025-08-14T21:20:14.9648929Z * [new branch] gh/henrylhtsang/128/base -> origin/gh/henrylhtsang/128/base 2025-08-14T21:20:14.9650660Z * [new branch] gh/henrylhtsang/128/head -> origin/gh/henrylhtsang/128/head 2025-08-14T21:20:14.9652446Z * [new branch] gh/henrylhtsang/128/orig -> origin/gh/henrylhtsang/128/orig 2025-08-14T21:20:14.9655098Z * [new branch] gh/henrylhtsang/129/base -> origin/gh/henrylhtsang/129/base 2025-08-14T21:20:14.9656875Z * [new branch] gh/henrylhtsang/129/head -> origin/gh/henrylhtsang/129/head 2025-08-14T21:20:14.9658600Z * [new branch] gh/henrylhtsang/129/orig -> origin/gh/henrylhtsang/129/orig 2025-08-14T21:20:14.9661142Z * [new branch] gh/henrylhtsang/130/base -> origin/gh/henrylhtsang/130/base 2025-08-14T21:20:14.9663052Z * [new branch] gh/henrylhtsang/130/head -> origin/gh/henrylhtsang/130/head 2025-08-14T21:20:14.9665409Z * [new branch] gh/henrylhtsang/131/base -> origin/gh/henrylhtsang/131/base 2025-08-14T21:20:14.9667210Z * [new branch] gh/henrylhtsang/131/head -> origin/gh/henrylhtsang/131/head 2025-08-14T21:20:14.9668880Z * [new branch] gh/henrylhtsang/131/orig -> origin/gh/henrylhtsang/131/orig 2025-08-14T21:20:14.9671365Z * [new branch] gh/henrylhtsang/132/base -> origin/gh/henrylhtsang/132/base 2025-08-14T21:20:14.9673136Z * [new branch] gh/henrylhtsang/132/head -> origin/gh/henrylhtsang/132/head 2025-08-14T21:20:14.9674973Z * [new branch] gh/henrylhtsang/132/orig -> origin/gh/henrylhtsang/132/orig 2025-08-14T21:20:14.9677507Z * [new branch] gh/henrylhtsang/133/base -> origin/gh/henrylhtsang/133/base 2025-08-14T21:20:14.9679305Z * [new branch] gh/henrylhtsang/133/head -> origin/gh/henrylhtsang/133/head 2025-08-14T21:20:14.9681074Z * [new branch] gh/henrylhtsang/133/orig -> origin/gh/henrylhtsang/133/orig 2025-08-14T21:20:14.9683581Z * [new branch] gh/henrylhtsang/134/base -> origin/gh/henrylhtsang/134/base 2025-08-14T21:20:14.9685650Z * [new branch] gh/henrylhtsang/134/head -> origin/gh/henrylhtsang/134/head 2025-08-14T21:20:14.9687538Z * [new branch] gh/henrylhtsang/134/orig -> origin/gh/henrylhtsang/134/orig 2025-08-14T21:20:14.9689981Z * [new branch] gh/henrylhtsang/135/base -> origin/gh/henrylhtsang/135/base 2025-08-14T21:20:14.9691826Z * [new branch] gh/henrylhtsang/135/head -> origin/gh/henrylhtsang/135/head 2025-08-14T21:20:14.9693596Z * [new branch] gh/henrylhtsang/135/orig -> origin/gh/henrylhtsang/135/orig 2025-08-14T21:20:14.9697256Z * [new branch] gh/henrylhtsang/136/base -> origin/gh/henrylhtsang/136/base 2025-08-14T21:20:14.9699084Z * [new branch] gh/henrylhtsang/136/head -> origin/gh/henrylhtsang/136/head 2025-08-14T21:20:14.9700942Z * [new branch] gh/henrylhtsang/136/orig -> origin/gh/henrylhtsang/136/orig 2025-08-14T21:20:14.9703324Z * [new branch] gh/henrylhtsang/137/base -> origin/gh/henrylhtsang/137/base 2025-08-14T21:20:14.9705112Z * [new branch] gh/henrylhtsang/137/head -> origin/gh/henrylhtsang/137/head 2025-08-14T21:20:14.9706825Z * [new branch] gh/henrylhtsang/137/orig -> origin/gh/henrylhtsang/137/orig 2025-08-14T21:20:14.9709385Z * [new branch] gh/henrylhtsang/138/base -> origin/gh/henrylhtsang/138/base 2025-08-14T21:20:14.9711159Z * [new branch] gh/henrylhtsang/138/head -> origin/gh/henrylhtsang/138/head 2025-08-14T21:20:14.9712975Z * [new branch] gh/henrylhtsang/138/orig -> origin/gh/henrylhtsang/138/orig 2025-08-14T21:20:14.9715476Z * [new branch] gh/henrylhtsang/139/base -> origin/gh/henrylhtsang/139/base 2025-08-14T21:20:14.9717256Z * [new branch] gh/henrylhtsang/139/head -> origin/gh/henrylhtsang/139/head 2025-08-14T21:20:14.9719120Z * [new branch] gh/henrylhtsang/139/orig -> origin/gh/henrylhtsang/139/orig 2025-08-14T21:20:14.9721705Z * [new branch] gh/henrylhtsang/140/base -> origin/gh/henrylhtsang/140/base 2025-08-14T21:20:14.9723586Z * [new branch] gh/henrylhtsang/140/head -> origin/gh/henrylhtsang/140/head 2025-08-14T21:20:14.9725567Z * [new branch] gh/henrylhtsang/140/orig -> origin/gh/henrylhtsang/140/orig 2025-08-14T21:20:14.9728041Z * [new branch] gh/henrylhtsang/141/base -> origin/gh/henrylhtsang/141/base 2025-08-14T21:20:14.9729847Z * [new branch] gh/henrylhtsang/141/head -> origin/gh/henrylhtsang/141/head 2025-08-14T21:20:14.9731583Z * [new branch] gh/henrylhtsang/141/orig -> origin/gh/henrylhtsang/141/orig 2025-08-14T21:20:14.9734856Z * [new branch] gh/henrylhtsang/142/base -> origin/gh/henrylhtsang/142/base 2025-08-14T21:20:14.9736428Z * [new branch] gh/henrylhtsang/142/head -> origin/gh/henrylhtsang/142/head 2025-08-14T21:20:14.9738415Z * [new branch] gh/henrylhtsang/142/orig -> origin/gh/henrylhtsang/142/orig 2025-08-14T21:20:14.9740830Z * [new branch] gh/henrylhtsang/143/base -> origin/gh/henrylhtsang/143/base 2025-08-14T21:20:14.9742611Z * [new branch] gh/henrylhtsang/143/head -> origin/gh/henrylhtsang/143/head 2025-08-14T21:20:14.9744347Z * [new branch] gh/henrylhtsang/143/orig -> origin/gh/henrylhtsang/143/orig 2025-08-14T21:20:14.9746948Z * [new branch] gh/henrylhtsang/144/base -> origin/gh/henrylhtsang/144/base 2025-08-14T21:20:14.9749031Z * [new branch] gh/henrylhtsang/144/head -> origin/gh/henrylhtsang/144/head 2025-08-14T21:20:14.9750723Z * [new branch] gh/henrylhtsang/144/orig -> origin/gh/henrylhtsang/144/orig 2025-08-14T21:20:14.9753297Z * [new branch] gh/henrylhtsang/145/base -> origin/gh/henrylhtsang/145/base 2025-08-14T21:20:14.9755131Z * [new branch] gh/henrylhtsang/145/head -> origin/gh/henrylhtsang/145/head 2025-08-14T21:20:14.9756990Z * [new branch] gh/henrylhtsang/145/orig -> origin/gh/henrylhtsang/145/orig 2025-08-14T21:20:14.9759576Z * [new branch] gh/henrylhtsang/146/base -> origin/gh/henrylhtsang/146/base 2025-08-14T21:20:14.9761428Z * [new branch] gh/henrylhtsang/146/head -> origin/gh/henrylhtsang/146/head 2025-08-14T21:20:14.9763230Z * [new branch] gh/henrylhtsang/146/orig -> origin/gh/henrylhtsang/146/orig 2025-08-14T21:20:14.9766445Z * [new branch] gh/huydhn/1/head -> origin/gh/huydhn/1/head 2025-08-14T21:20:14.9768076Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-08-14T21:20:14.9770947Z * [new branch] gh/huydhn/2/head -> origin/gh/huydhn/2/head 2025-08-14T21:20:14.9772552Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-08-14T21:20:14.9774417Z * [new branch] gh/huydhn/2/orig -> origin/gh/huydhn/2/orig 2025-08-14T21:20:14.9776835Z * [new branch] gh/huydhn/3/head -> origin/gh/huydhn/3/head 2025-08-14T21:20:14.9778465Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-08-14T21:20:14.9780297Z * [new branch] gh/huydhn/3/orig -> origin/gh/huydhn/3/orig 2025-08-14T21:20:14.9782719Z * [new branch] gh/huydhn/4/head -> origin/gh/huydhn/4/head 2025-08-14T21:20:14.9784372Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-08-14T21:20:14.9786317Z * [new branch] gh/huydhn/4/orig -> origin/gh/huydhn/4/orig 2025-08-14T21:20:14.9788710Z * [new branch] gh/huydhn/5/head -> origin/gh/huydhn/5/head 2025-08-14T21:20:14.9790380Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-08-14T21:20:14.9792112Z * [new branch] gh/huydhn/5/orig -> origin/gh/huydhn/5/orig 2025-08-14T21:20:14.9794678Z * [new branch] gh/huydhn/6/head -> origin/gh/huydhn/6/head 2025-08-14T21:20:14.9796503Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-08-14T21:20:14.9798159Z * [new branch] gh/huydhn/6/orig -> origin/gh/huydhn/6/orig 2025-08-14T21:20:14.9801214Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-08-14T21:20:14.9803016Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-08-14T21:20:14.9806125Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-08-14T21:20:14.9808546Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-08-14T21:20:14.9810942Z * [new branch] gh/isuruf/116/base -> origin/gh/isuruf/116/base 2025-08-14T21:20:14.9812692Z * [new branch] gh/isuruf/116/head -> origin/gh/isuruf/116/head 2025-08-14T21:20:14.9814770Z * [new branch] gh/isuruf/116/orig -> origin/gh/isuruf/116/orig 2025-08-14T21:20:14.9817138Z * [new branch] gh/isuruf/141/base -> origin/gh/isuruf/141/base 2025-08-14T21:20:14.9818860Z * [new branch] gh/isuruf/141/head -> origin/gh/isuruf/141/head 2025-08-14T21:20:14.9820606Z * [new branch] gh/isuruf/141/orig -> origin/gh/isuruf/141/orig 2025-08-14T21:20:14.9823091Z * [new branch] gh/isuruf/142/base -> origin/gh/isuruf/142/base 2025-08-14T21:20:14.9824882Z * [new branch] gh/isuruf/142/head -> origin/gh/isuruf/142/head 2025-08-14T21:20:14.9826651Z * [new branch] gh/isuruf/142/orig -> origin/gh/isuruf/142/orig 2025-08-14T21:20:14.9829108Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-08-14T21:20:14.9830871Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-08-14T21:20:14.9832771Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-08-14T21:20:14.9836150Z * [new branch] gh/jamesjwu/140/base -> origin/gh/jamesjwu/140/base 2025-08-14T21:20:14.9837939Z * [new branch] gh/jamesjwu/140/head -> origin/gh/jamesjwu/140/head 2025-08-14T21:20:14.9839739Z * [new branch] gh/jamesjwu/140/orig -> origin/gh/jamesjwu/140/orig 2025-08-14T21:20:14.9842206Z * [new branch] gh/jamesjwu/150/base -> origin/gh/jamesjwu/150/base 2025-08-14T21:20:14.9843959Z * [new branch] gh/jamesjwu/150/head -> origin/gh/jamesjwu/150/head 2025-08-14T21:20:14.9845841Z * [new branch] gh/jamesjwu/150/orig -> origin/gh/jamesjwu/150/orig 2025-08-14T21:20:14.9848742Z * [new branch] gh/jamesjwu/154/base -> origin/gh/jamesjwu/154/base 2025-08-14T21:20:14.9850429Z * [new branch] gh/jamesjwu/154/head -> origin/gh/jamesjwu/154/head 2025-08-14T21:20:14.9852147Z * [new branch] gh/jamesjwu/154/orig -> origin/gh/jamesjwu/154/orig 2025-08-14T21:20:14.9854628Z * [new branch] gh/jamesjwu/155/base -> origin/gh/jamesjwu/155/base 2025-08-14T21:20:14.9856539Z * [new branch] gh/jamesjwu/155/head -> origin/gh/jamesjwu/155/head 2025-08-14T21:20:14.9858316Z * [new branch] gh/jamesjwu/155/orig -> origin/gh/jamesjwu/155/orig 2025-08-14T21:20:14.9860708Z * [new branch] gh/jamesjwu/159/base -> origin/gh/jamesjwu/159/base 2025-08-14T21:20:14.9862543Z * [new branch] gh/jamesjwu/159/head -> origin/gh/jamesjwu/159/head 2025-08-14T21:20:14.9864257Z * [new branch] gh/jamesjwu/159/orig -> origin/gh/jamesjwu/159/orig 2025-08-14T21:20:14.9867065Z * [new branch] gh/jamesjwu/163/base -> origin/gh/jamesjwu/163/base 2025-08-14T21:20:14.9868899Z * [new branch] gh/jamesjwu/163/head -> origin/gh/jamesjwu/163/head 2025-08-14T21:20:14.9870636Z * [new branch] gh/jamesjwu/163/orig -> origin/gh/jamesjwu/163/orig 2025-08-14T21:20:14.9873053Z * [new branch] gh/jamesjwu/171/base -> origin/gh/jamesjwu/171/base 2025-08-14T21:20:14.9874848Z * [new branch] gh/jamesjwu/171/head -> origin/gh/jamesjwu/171/head 2025-08-14T21:20:14.9876639Z * [new branch] gh/jamesjwu/171/orig -> origin/gh/jamesjwu/171/orig 2025-08-14T21:20:14.9879688Z * [new branch] gh/jamesjwu/174/base -> origin/gh/jamesjwu/174/base 2025-08-14T21:20:14.9881573Z * [new branch] gh/jamesjwu/174/head -> origin/gh/jamesjwu/174/head 2025-08-14T21:20:14.9883357Z * [new branch] gh/jamesjwu/174/orig -> origin/gh/jamesjwu/174/orig 2025-08-14T21:20:14.9886027Z * [new branch] gh/jamesjwu/175/base -> origin/gh/jamesjwu/175/base 2025-08-14T21:20:14.9887900Z * [new branch] gh/jamesjwu/175/head -> origin/gh/jamesjwu/175/head 2025-08-14T21:20:14.9889474Z * [new branch] gh/jamesjwu/175/orig -> origin/gh/jamesjwu/175/orig 2025-08-14T21:20:14.9891921Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-08-14T21:20:14.9893733Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-08-14T21:20:14.9895511Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-08-14T21:20:14.9898182Z * [new branch] gh/jamesjwu/177/base -> origin/gh/jamesjwu/177/base 2025-08-14T21:20:14.9900216Z * [new branch] gh/jamesjwu/177/head -> origin/gh/jamesjwu/177/head 2025-08-14T21:20:14.9901975Z * [new branch] gh/jamesjwu/177/orig -> origin/gh/jamesjwu/177/orig 2025-08-14T21:20:14.9904485Z * [new branch] gh/jamesjwu/178/base -> origin/gh/jamesjwu/178/base 2025-08-14T21:20:14.9906377Z * [new branch] gh/jamesjwu/178/head -> origin/gh/jamesjwu/178/head 2025-08-14T21:20:14.9908179Z * [new branch] gh/jamesjwu/178/orig -> origin/gh/jamesjwu/178/orig 2025-08-14T21:20:14.9910633Z * [new branch] gh/jamesjwu/179/base -> origin/gh/jamesjwu/179/base 2025-08-14T21:20:14.9912394Z * [new branch] gh/jamesjwu/179/head -> origin/gh/jamesjwu/179/head 2025-08-14T21:20:14.9914237Z * [new branch] gh/jamesjwu/179/orig -> origin/gh/jamesjwu/179/orig 2025-08-14T21:20:14.9916710Z * [new branch] gh/jamesjwu/180/base -> origin/gh/jamesjwu/180/base 2025-08-14T21:20:14.9918431Z * [new branch] gh/jamesjwu/180/head -> origin/gh/jamesjwu/180/head 2025-08-14T21:20:14.9920188Z * [new branch] gh/jamesjwu/180/orig -> origin/gh/jamesjwu/180/orig 2025-08-14T21:20:14.9922649Z * [new branch] gh/jamesjwu/181/base -> origin/gh/jamesjwu/181/base 2025-08-14T21:20:14.9924522Z * [new branch] gh/jamesjwu/181/head -> origin/gh/jamesjwu/181/head 2025-08-14T21:20:14.9926353Z * [new branch] gh/jamesjwu/181/orig -> origin/gh/jamesjwu/181/orig 2025-08-14T21:20:14.9928779Z * [new branch] gh/jamesjwu/182/base -> origin/gh/jamesjwu/182/base 2025-08-14T21:20:14.9930682Z * [new branch] gh/jamesjwu/182/head -> origin/gh/jamesjwu/182/head 2025-08-14T21:20:14.9932417Z * [new branch] gh/jamesjwu/182/orig -> origin/gh/jamesjwu/182/orig 2025-08-14T21:20:14.9935332Z * [new branch] gh/jamesjwu/183/base -> origin/gh/jamesjwu/183/base 2025-08-14T21:20:14.9937087Z * [new branch] gh/jamesjwu/183/head -> origin/gh/jamesjwu/183/head 2025-08-14T21:20:14.9938902Z * [new branch] gh/jamesjwu/183/orig -> origin/gh/jamesjwu/183/orig 2025-08-14T21:20:14.9941403Z * [new branch] gh/jamesjwu/184/base -> origin/gh/jamesjwu/184/base 2025-08-14T21:20:14.9943126Z * [new branch] gh/jamesjwu/184/head -> origin/gh/jamesjwu/184/head 2025-08-14T21:20:14.9944904Z * [new branch] gh/jamesjwu/184/orig -> origin/gh/jamesjwu/184/orig 2025-08-14T21:20:14.9947669Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-08-14T21:20:14.9949706Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-08-14T21:20:14.9952046Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-08-14T21:20:14.9953804Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-08-14T21:20:14.9956293Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-08-14T21:20:14.9957994Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-08-14T21:20:14.9960574Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-08-14T21:20:14.9962132Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-08-14T21:20:14.9964610Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-08-14T21:20:14.9966343Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-08-14T21:20:14.9968714Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-08-14T21:20:14.9970532Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-08-14T21:20:14.9973250Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-08-14T21:20:14.9974953Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-08-14T21:20:14.9977501Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-08-14T21:20:14.9979240Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-08-14T21:20:14.9981641Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-08-14T21:20:14.9983347Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-08-14T21:20:14.9985725Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-08-14T21:20:14.9987465Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-08-14T21:20:14.9989866Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-08-14T21:20:14.9991543Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-08-14T21:20:14.9993930Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-08-14T21:20:14.9995732Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-08-14T21:20:14.9998376Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 2025-08-14T21:20:15.0000250Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-08-14T21:20:15.0002611Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-08-14T21:20:15.0004303Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-08-14T21:20:15.0007510Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-08-14T21:20:15.0009313Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-08-14T21:20:15.0011064Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-08-14T21:20:15.0013903Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-08-14T21:20:15.0015785Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-08-14T21:20:15.0017574Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-08-14T21:20:15.0020547Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-08-14T21:20:15.0022435Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-08-14T21:20:15.0024280Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-08-14T21:20:15.0026800Z * [new branch] gh/janeyx99/256/base -> origin/gh/janeyx99/256/base 2025-08-14T21:20:15.0028612Z * [new branch] gh/janeyx99/256/head -> origin/gh/janeyx99/256/head 2025-08-14T21:20:15.0030366Z * [new branch] gh/janeyx99/256/orig -> origin/gh/janeyx99/256/orig 2025-08-14T21:20:15.0032898Z * [new branch] gh/janeyx99/268/base -> origin/gh/janeyx99/268/base 2025-08-14T21:20:15.0034782Z * [new branch] gh/janeyx99/268/head -> origin/gh/janeyx99/268/head 2025-08-14T21:20:15.0036687Z * [new branch] gh/janeyx99/268/orig -> origin/gh/janeyx99/268/orig 2025-08-14T21:20:15.0038818Z * [new branch] gh/janeyx99/269/base -> origin/gh/janeyx99/269/base 2025-08-14T21:20:15.0040648Z * [new branch] gh/janeyx99/269/head -> origin/gh/janeyx99/269/head 2025-08-14T21:20:15.0042397Z * [new branch] gh/janeyx99/269/orig -> origin/gh/janeyx99/269/orig 2025-08-14T21:20:15.0044917Z * [new branch] gh/janeyx99/274/base -> origin/gh/janeyx99/274/base 2025-08-14T21:20:15.0046714Z * [new branch] gh/janeyx99/274/head -> origin/gh/janeyx99/274/head 2025-08-14T21:20:15.0049315Z * [new branch] gh/janeyx99/274/orig -> origin/gh/janeyx99/274/orig 2025-08-14T21:20:15.0051828Z * [new branch] gh/janeyx99/276/base -> origin/gh/janeyx99/276/base 2025-08-14T21:20:15.0053615Z * [new branch] gh/janeyx99/276/head -> origin/gh/janeyx99/276/head 2025-08-14T21:20:15.0055615Z * [new branch] gh/janeyx99/276/orig -> origin/gh/janeyx99/276/orig 2025-08-14T21:20:15.0058025Z * [new branch] gh/janeyx99/277/base -> origin/gh/janeyx99/277/base 2025-08-14T21:20:15.0059822Z * [new branch] gh/janeyx99/277/head -> origin/gh/janeyx99/277/head 2025-08-14T21:20:15.0061616Z * [new branch] gh/janeyx99/277/orig -> origin/gh/janeyx99/277/orig 2025-08-14T21:20:15.0064126Z * [new branch] gh/janeyx99/278/base -> origin/gh/janeyx99/278/base 2025-08-14T21:20:15.0066017Z * [new branch] gh/janeyx99/278/head -> origin/gh/janeyx99/278/head 2025-08-14T21:20:15.0067862Z * [new branch] gh/janeyx99/278/orig -> origin/gh/janeyx99/278/orig 2025-08-14T21:20:15.0070392Z * [new branch] gh/janeyx99/279/base -> origin/gh/janeyx99/279/base 2025-08-14T21:20:15.0072180Z * [new branch] gh/janeyx99/279/head -> origin/gh/janeyx99/279/head 2025-08-14T21:20:15.0074003Z * [new branch] gh/janeyx99/279/orig -> origin/gh/janeyx99/279/orig 2025-08-14T21:20:15.0076527Z * [new branch] gh/janeyx99/280/base -> origin/gh/janeyx99/280/base 2025-08-14T21:20:15.0078809Z * [new branch] gh/janeyx99/280/head -> origin/gh/janeyx99/280/head 2025-08-14T21:20:15.0080544Z * [new branch] gh/janeyx99/280/orig -> origin/gh/janeyx99/280/orig 2025-08-14T21:20:15.0082416Z * [new branch] gh/janeyx99/281/base -> origin/gh/janeyx99/281/base 2025-08-14T21:20:15.0084138Z * [new branch] gh/janeyx99/281/head -> origin/gh/janeyx99/281/head 2025-08-14T21:20:15.0086080Z * [new branch] gh/janeyx99/281/orig -> origin/gh/janeyx99/281/orig 2025-08-14T21:20:15.0088638Z * [new branch] gh/janeyx99/282/base -> origin/gh/janeyx99/282/base 2025-08-14T21:20:15.0090333Z * [new branch] gh/janeyx99/282/head -> origin/gh/janeyx99/282/head 2025-08-14T21:20:15.0092072Z * [new branch] gh/janeyx99/282/orig -> origin/gh/janeyx99/282/orig 2025-08-14T21:20:15.0094560Z * [new branch] gh/janeyx99/283/base -> origin/gh/janeyx99/283/base 2025-08-14T21:20:15.0096490Z * [new branch] gh/janeyx99/283/head -> origin/gh/janeyx99/283/head 2025-08-14T21:20:15.0098438Z * [new branch] gh/janeyx99/283/orig -> origin/gh/janeyx99/283/orig 2025-08-14T21:20:15.0101149Z * [new branch] gh/janeyx99/284/base -> origin/gh/janeyx99/284/base 2025-08-14T21:20:15.0102896Z * [new branch] gh/janeyx99/284/head -> origin/gh/janeyx99/284/head 2025-08-14T21:20:15.0104695Z * [new branch] gh/janeyx99/284/orig -> origin/gh/janeyx99/284/orig 2025-08-14T21:20:15.0107845Z * [new branch] gh/janeyx99/285/base -> origin/gh/janeyx99/285/base 2025-08-14T21:20:15.0109502Z * [new branch] gh/janeyx99/285/head -> origin/gh/janeyx99/285/head 2025-08-14T21:20:15.0111295Z * [new branch] gh/janeyx99/285/orig -> origin/gh/janeyx99/285/orig 2025-08-14T21:20:15.0113936Z * [new branch] gh/janeyx99/286/base -> origin/gh/janeyx99/286/base 2025-08-14T21:20:15.0115731Z * [new branch] gh/janeyx99/286/head -> origin/gh/janeyx99/286/head 2025-08-14T21:20:15.0117489Z * [new branch] gh/janeyx99/286/orig -> origin/gh/janeyx99/286/orig 2025-08-14T21:20:15.0120056Z * [new branch] gh/janeyx99/287/base -> origin/gh/janeyx99/287/base 2025-08-14T21:20:15.0121850Z * [new branch] gh/janeyx99/287/head -> origin/gh/janeyx99/287/head 2025-08-14T21:20:15.0123703Z * [new branch] gh/janeyx99/287/orig -> origin/gh/janeyx99/287/orig 2025-08-14T21:20:15.0126440Z * [new branch] gh/janeyx99/288/base -> origin/gh/janeyx99/288/base 2025-08-14T21:20:15.0128238Z * [new branch] gh/janeyx99/288/head -> origin/gh/janeyx99/288/head 2025-08-14T21:20:15.0129940Z * [new branch] gh/janeyx99/288/orig -> origin/gh/janeyx99/288/orig 2025-08-14T21:20:15.0132507Z * [new branch] gh/janeyx99/289/base -> origin/gh/janeyx99/289/base 2025-08-14T21:20:15.0134255Z * [new branch] gh/janeyx99/289/head -> origin/gh/janeyx99/289/head 2025-08-14T21:20:15.0136017Z * [new branch] gh/janeyx99/289/orig -> origin/gh/janeyx99/289/orig 2025-08-14T21:20:15.0138732Z * [new branch] gh/janeyx99/290/base -> origin/gh/janeyx99/290/base 2025-08-14T21:20:15.0140445Z * [new branch] gh/janeyx99/290/head -> origin/gh/janeyx99/290/head 2025-08-14T21:20:15.0142225Z * [new branch] gh/janeyx99/290/orig -> origin/gh/janeyx99/290/orig 2025-08-14T21:20:15.0144805Z * [new branch] gh/janeyx99/291/base -> origin/gh/janeyx99/291/base 2025-08-14T21:20:15.0146633Z * [new branch] gh/janeyx99/291/head -> origin/gh/janeyx99/291/head 2025-08-14T21:20:15.0151068Z * [new branch] gh/janeyx99/291/orig -> origin/gh/janeyx99/291/orig 2025-08-14T21:20:15.0153629Z * [new branch] gh/janeyx99/292/base -> origin/gh/janeyx99/292/base 2025-08-14T21:20:15.0155446Z * [new branch] gh/janeyx99/292/head -> origin/gh/janeyx99/292/head 2025-08-14T21:20:15.0157362Z * [new branch] gh/janeyx99/292/orig -> origin/gh/janeyx99/292/orig 2025-08-14T21:20:15.0159876Z * [new branch] gh/janeyx99/293/base -> origin/gh/janeyx99/293/base 2025-08-14T21:20:15.0161633Z * [new branch] gh/janeyx99/293/head -> origin/gh/janeyx99/293/head 2025-08-14T21:20:15.0163454Z * [new branch] gh/janeyx99/293/orig -> origin/gh/janeyx99/293/orig 2025-08-14T21:20:15.0166300Z * [new branch] gh/janeyx99/294/base -> origin/gh/janeyx99/294/base 2025-08-14T21:20:15.0168033Z * [new branch] gh/janeyx99/294/head -> origin/gh/janeyx99/294/head 2025-08-14T21:20:15.0169804Z * [new branch] gh/janeyx99/294/orig -> origin/gh/janeyx99/294/orig 2025-08-14T21:20:15.0172354Z * [new branch] gh/janeyx99/295/base -> origin/gh/janeyx99/295/base 2025-08-14T21:20:15.0174672Z * [new branch] gh/janeyx99/295/head -> origin/gh/janeyx99/295/head 2025-08-14T21:20:15.0176560Z * [new branch] gh/janeyx99/295/orig -> origin/gh/janeyx99/295/orig 2025-08-14T21:20:15.0179107Z * [new branch] gh/janeyx99/296/base -> origin/gh/janeyx99/296/base 2025-08-14T21:20:15.0180886Z * [new branch] gh/janeyx99/296/head -> origin/gh/janeyx99/296/head 2025-08-14T21:20:15.0182771Z * [new branch] gh/janeyx99/296/orig -> origin/gh/janeyx99/296/orig 2025-08-14T21:20:15.0185185Z * [new branch] gh/janeyx99/297/base -> origin/gh/janeyx99/297/base 2025-08-14T21:20:15.0186981Z * [new branch] gh/janeyx99/297/head -> origin/gh/janeyx99/297/head 2025-08-14T21:20:15.0188753Z * [new branch] gh/janeyx99/297/orig -> origin/gh/janeyx99/297/orig 2025-08-14T21:20:15.0191260Z * [new branch] gh/janeyx99/298/base -> origin/gh/janeyx99/298/base 2025-08-14T21:20:15.0193068Z * [new branch] gh/janeyx99/298/head -> origin/gh/janeyx99/298/head 2025-08-14T21:20:15.0194812Z * [new branch] gh/janeyx99/298/orig -> origin/gh/janeyx99/298/orig 2025-08-14T21:20:15.0197326Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-08-14T21:20:15.0199113Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-08-14T21:20:15.0200946Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-08-14T21:20:15.0203503Z * [new branch] gh/janeyx99/300/base -> origin/gh/janeyx99/300/base 2025-08-14T21:20:15.0205483Z * [new branch] gh/janeyx99/300/head -> origin/gh/janeyx99/300/head 2025-08-14T21:20:15.0207304Z * [new branch] gh/janeyx99/300/orig -> origin/gh/janeyx99/300/orig 2025-08-14T21:20:15.0209907Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-08-14T21:20:15.0211692Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-08-14T21:20:15.0213441Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-08-14T21:20:15.0216506Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-08-14T21:20:15.0218297Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-08-14T21:20:15.0220806Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-08-14T21:20:15.0222599Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-08-14T21:20:15.0224427Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-08-14T21:20:15.0226915Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-08-14T21:20:15.0228759Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-08-14T21:20:15.0230513Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-08-14T21:20:15.0232992Z * [new branch] gh/jansel/531/base -> origin/gh/jansel/531/base 2025-08-14T21:20:15.0234687Z * [new branch] gh/jansel/531/head -> origin/gh/jansel/531/head 2025-08-14T21:20:15.0236484Z * [new branch] gh/jansel/531/orig -> origin/gh/jansel/531/orig 2025-08-14T21:20:15.0238883Z * [new branch] gh/jansel/534/base -> origin/gh/jansel/534/base 2025-08-14T21:20:15.0240593Z * [new branch] gh/jansel/534/head -> origin/gh/jansel/534/head 2025-08-14T21:20:15.0242404Z * [new branch] gh/jansel/534/orig -> origin/gh/jansel/534/orig 2025-08-14T21:20:15.0245606Z * [new branch] gh/jbschlosser/226/base -> origin/gh/jbschlosser/226/base 2025-08-14T21:20:15.0247771Z * [new branch] gh/jbschlosser/226/head -> origin/gh/jbschlosser/226/head 2025-08-14T21:20:15.0249570Z * [new branch] gh/jbschlosser/226/orig -> origin/gh/jbschlosser/226/orig 2025-08-14T21:20:15.0252098Z * [new branch] gh/jbschlosser/239/base -> origin/gh/jbschlosser/239/base 2025-08-14T21:20:15.0253958Z * [new branch] gh/jbschlosser/239/head -> origin/gh/jbschlosser/239/head 2025-08-14T21:20:15.0255879Z * [new branch] gh/jbschlosser/239/orig -> origin/gh/jbschlosser/239/orig 2025-08-14T21:20:15.0258218Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-08-14T21:20:15.0259928Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-08-14T21:20:15.0261653Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-08-14T21:20:15.0264259Z * [new branch] gh/jbschlosser/248/base -> origin/gh/jbschlosser/248/base 2025-08-14T21:20:15.0266085Z * [new branch] gh/jbschlosser/248/head -> origin/gh/jbschlosser/248/head 2025-08-14T21:20:15.0267824Z * [new branch] gh/jbschlosser/248/orig -> origin/gh/jbschlosser/248/orig 2025-08-14T21:20:15.0270344Z * [new branch] gh/jbschlosser/249/base -> origin/gh/jbschlosser/249/base 2025-08-14T21:20:15.0272940Z * [new branch] gh/jbschlosser/249/head -> origin/gh/jbschlosser/249/head 2025-08-14T21:20:15.0274752Z * [new branch] gh/jbschlosser/249/orig -> origin/gh/jbschlosser/249/orig 2025-08-14T21:20:15.0277352Z * [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-08-14T21:20:15.0279078Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-08-14T21:20:15.0280884Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-08-14T21:20:15.0283882Z * [new branch] gh/jiayisunx/57/base -> origin/gh/jiayisunx/57/base 2025-08-14T21:20:15.0285987Z * [new branch] gh/jiayisunx/57/head -> origin/gh/jiayisunx/57/head 2025-08-14T21:20:15.0287812Z * [new branch] gh/jiayisunx/57/orig -> origin/gh/jiayisunx/57/orig 2025-08-14T21:20:15.0290204Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-08-14T21:20:15.0292023Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-08-14T21:20:15.0293768Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-08-14T21:20:15.0296653Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-08-14T21:20:15.0298429Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-08-14T21:20:15.0300140Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-08-14T21:20:15.0303011Z * [new branch] gh/jiayisunx/63/base -> origin/gh/jiayisunx/63/base 2025-08-14T21:20:15.0304820Z * [new branch] gh/jiayisunx/63/head -> origin/gh/jiayisunx/63/head 2025-08-14T21:20:15.0306693Z * [new branch] gh/jiayisunx/63/orig -> origin/gh/jiayisunx/63/orig 2025-08-14T21:20:15.0309096Z * [new branch] gh/jiayisunx/64/base -> origin/gh/jiayisunx/64/base 2025-08-14T21:20:15.0310886Z * [new branch] gh/jiayisunx/64/head -> origin/gh/jiayisunx/64/head 2025-08-14T21:20:15.0312645Z * [new branch] gh/jiayisunx/64/orig -> origin/gh/jiayisunx/64/orig 2025-08-14T21:20:15.0315558Z * [new branch] gh/jiayisunx/65/base -> origin/gh/jiayisunx/65/base 2025-08-14T21:20:15.0317351Z * [new branch] gh/jiayisunx/65/head -> origin/gh/jiayisunx/65/head 2025-08-14T21:20:15.0319082Z * [new branch] gh/jiayisunx/65/orig -> origin/gh/jiayisunx/65/orig 2025-08-14T21:20:15.0321616Z * [new branch] gh/jiayisunx/66/base -> origin/gh/jiayisunx/66/base 2025-08-14T21:20:15.0323404Z * [new branch] gh/jiayisunx/66/head -> origin/gh/jiayisunx/66/head 2025-08-14T21:20:15.0325298Z * [new branch] gh/jiayisunx/66/orig -> origin/gh/jiayisunx/66/orig 2025-08-14T21:20:15.0327936Z * [new branch] gh/jiayisunx/67/base -> origin/gh/jiayisunx/67/base 2025-08-14T21:20:15.0329799Z * [new branch] gh/jiayisunx/67/head -> origin/gh/jiayisunx/67/head 2025-08-14T21:20:15.0331551Z * [new branch] gh/jiayisunx/67/orig -> origin/gh/jiayisunx/67/orig 2025-08-14T21:20:15.0333995Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-08-14T21:20:15.0335805Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-08-14T21:20:15.0337528Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-08-14T21:20:15.0340346Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-08-14T21:20:15.0342145Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-08-14T21:20:15.0345245Z * [new branch] gh/justinchuby/111/base -> origin/gh/justinchuby/111/base 2025-08-14T21:20:15.0347276Z * [new branch] gh/justinchuby/111/head -> origin/gh/justinchuby/111/head 2025-08-14T21:20:15.0349146Z * [new branch] gh/justinchuby/111/orig -> origin/gh/justinchuby/111/orig 2025-08-14T21:20:15.0352517Z * [new branch] gh/kurtamohler/32/base -> origin/gh/kurtamohler/32/base 2025-08-14T21:20:15.0354297Z * [new branch] gh/kurtamohler/32/head -> origin/gh/kurtamohler/32/head 2025-08-14T21:20:15.0356072Z * [new branch] gh/kurtamohler/32/orig -> origin/gh/kurtamohler/32/orig 2025-08-14T21:20:15.0358471Z * [new branch] gh/kurtamohler/33/base -> origin/gh/kurtamohler/33/base 2025-08-14T21:20:15.0360220Z * [new branch] gh/kurtamohler/33/head -> origin/gh/kurtamohler/33/head 2025-08-14T21:20:15.0362011Z * [new branch] gh/kurtamohler/33/orig -> origin/gh/kurtamohler/33/orig 2025-08-14T21:20:15.0364608Z * [new branch] gh/kurtamohler/34/base -> origin/gh/kurtamohler/34/base 2025-08-14T21:20:15.0366486Z * [new branch] gh/kurtamohler/34/head -> origin/gh/kurtamohler/34/head 2025-08-14T21:20:15.0368327Z * [new branch] gh/kurtamohler/34/orig -> origin/gh/kurtamohler/34/orig 2025-08-14T21:20:15.0370750Z * [new branch] gh/kurtamohler/40/base -> origin/gh/kurtamohler/40/base 2025-08-14T21:20:15.0372522Z * [new branch] gh/kurtamohler/40/head -> origin/gh/kurtamohler/40/head 2025-08-14T21:20:15.0374271Z * [new branch] gh/kurtamohler/40/orig -> origin/gh/kurtamohler/40/orig 2025-08-14T21:20:15.0376716Z * [new branch] gh/kurtamohler/41/base -> origin/gh/kurtamohler/41/base 2025-08-14T21:20:15.0378469Z * [new branch] gh/kurtamohler/41/head -> origin/gh/kurtamohler/41/head 2025-08-14T21:20:15.0380217Z * [new branch] gh/kurtamohler/41/orig -> origin/gh/kurtamohler/41/orig 2025-08-14T21:20:15.0382638Z * [new branch] gh/kurtamohler/42/base -> origin/gh/kurtamohler/42/base 2025-08-14T21:20:15.0384417Z * [new branch] gh/kurtamohler/42/head -> origin/gh/kurtamohler/42/head 2025-08-14T21:20:15.0386904Z * [new branch] gh/kurtamohler/42/orig -> origin/gh/kurtamohler/42/orig 2025-08-14T21:20:15.0389260Z * [new branch] gh/kurtamohler/43/base -> origin/gh/kurtamohler/43/base 2025-08-14T21:20:15.0391022Z * [new branch] gh/kurtamohler/43/head -> origin/gh/kurtamohler/43/head 2025-08-14T21:20:15.0392848Z * [new branch] gh/kurtamohler/43/orig -> origin/gh/kurtamohler/43/orig 2025-08-14T21:20:15.0395303Z * [new branch] gh/kurtamohler/44/base -> origin/gh/kurtamohler/44/base 2025-08-14T21:20:15.0396987Z * [new branch] gh/kurtamohler/44/head -> origin/gh/kurtamohler/44/head 2025-08-14T21:20:15.0398742Z * [new branch] gh/kurtamohler/44/orig -> origin/gh/kurtamohler/44/orig 2025-08-14T21:20:15.0401292Z * [new branch] gh/kurtamohler/45/base -> origin/gh/kurtamohler/45/base 2025-08-14T21:20:15.0403023Z * [new branch] gh/kurtamohler/45/head -> origin/gh/kurtamohler/45/head 2025-08-14T21:20:15.0404812Z * [new branch] gh/kurtamohler/45/orig -> origin/gh/kurtamohler/45/orig 2025-08-14T21:20:15.0407304Z * [new branch] gh/kurtamohler/46/base -> origin/gh/kurtamohler/46/base 2025-08-14T21:20:15.0409068Z * [new branch] gh/kurtamohler/46/head -> origin/gh/kurtamohler/46/head 2025-08-14T21:20:15.0410871Z * [new branch] gh/kurtamohler/46/orig -> origin/gh/kurtamohler/46/orig 2025-08-14T21:20:15.0413916Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-08-14T21:20:15.0415883Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-08-14T21:20:15.0417722Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 2025-08-14T21:20:15.0420276Z * [new branch] gh/kwen2501/142/base -> origin/gh/kwen2501/142/base 2025-08-14T21:20:15.0421981Z * [new branch] gh/kwen2501/142/head -> origin/gh/kwen2501/142/head 2025-08-14T21:20:15.0423822Z * [new branch] gh/kwen2501/142/orig -> origin/gh/kwen2501/142/orig 2025-08-14T21:20:15.0426340Z * [new branch] gh/kwen2501/15/base -> origin/gh/kwen2501/15/base 2025-08-14T21:20:15.0428125Z * [new branch] gh/kwen2501/15/head -> origin/gh/kwen2501/15/head 2025-08-14T21:20:15.0430584Z * [new branch] gh/kwen2501/156/base -> origin/gh/kwen2501/156/base 2025-08-14T21:20:15.0432308Z * [new branch] gh/kwen2501/156/head -> origin/gh/kwen2501/156/head 2025-08-14T21:20:15.0434070Z * [new branch] gh/kwen2501/156/orig -> origin/gh/kwen2501/156/orig 2025-08-14T21:20:15.0436533Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-08-14T21:20:15.0438242Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-08-14T21:20:15.0440841Z * [new branch] gh/kwen2501/179/base -> origin/gh/kwen2501/179/base 2025-08-14T21:20:15.0442596Z * [new branch] gh/kwen2501/179/head -> origin/gh/kwen2501/179/head 2025-08-14T21:20:15.0444428Z * [new branch] gh/kwen2501/179/orig -> origin/gh/kwen2501/179/orig 2025-08-14T21:20:15.0446992Z * [new branch] gh/kwen2501/181/base -> origin/gh/kwen2501/181/base 2025-08-14T21:20:15.0449017Z * [new branch] gh/kwen2501/181/head -> origin/gh/kwen2501/181/head 2025-08-14T21:20:15.0450727Z * [new branch] gh/kwen2501/181/orig -> origin/gh/kwen2501/181/orig 2025-08-14T21:20:15.0453089Z * [new branch] gh/kwen2501/183/base -> origin/gh/kwen2501/183/base 2025-08-14T21:20:15.0454889Z * [new branch] gh/kwen2501/183/head -> origin/gh/kwen2501/183/head 2025-08-14T21:20:15.0456658Z * [new branch] gh/kwen2501/183/orig -> origin/gh/kwen2501/183/orig 2025-08-14T21:20:15.0459124Z * [new branch] gh/kwen2501/184/base -> origin/gh/kwen2501/184/base 2025-08-14T21:20:15.0461398Z * [new branch] gh/kwen2501/184/head -> origin/gh/kwen2501/184/head 2025-08-14T21:20:15.0463245Z * [new branch] gh/kwen2501/184/orig -> origin/gh/kwen2501/184/orig 2025-08-14T21:20:15.0465851Z * [new branch] gh/kwen2501/186/base -> origin/gh/kwen2501/186/base 2025-08-14T21:20:15.0467651Z * [new branch] gh/kwen2501/186/head -> origin/gh/kwen2501/186/head 2025-08-14T21:20:15.0469381Z * [new branch] gh/kwen2501/186/orig -> origin/gh/kwen2501/186/orig 2025-08-14T21:20:15.0471662Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-08-14T21:20:15.0473712Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-08-14T21:20:15.0475373Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-08-14T21:20:15.0477791Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-08-14T21:20:15.0479504Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-08-14T21:20:15.0481294Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-08-14T21:20:15.0483794Z * [new branch] gh/kwen2501/194/base -> origin/gh/kwen2501/194/base 2025-08-14T21:20:15.0485708Z * [new branch] gh/kwen2501/194/head -> origin/gh/kwen2501/194/head 2025-08-14T21:20:15.0487495Z * [new branch] gh/kwen2501/194/orig -> origin/gh/kwen2501/194/orig 2025-08-14T21:20:15.0490183Z * [new branch] gh/kwen2501/195/base -> origin/gh/kwen2501/195/base 2025-08-14T21:20:15.0491938Z * [new branch] gh/kwen2501/195/head -> origin/gh/kwen2501/195/head 2025-08-14T21:20:15.0493675Z * [new branch] gh/kwen2501/195/orig -> origin/gh/kwen2501/195/orig 2025-08-14T21:20:15.0496085Z * [new branch] gh/kwen2501/196/base -> origin/gh/kwen2501/196/base 2025-08-14T21:20:15.0497812Z * [new branch] gh/kwen2501/196/head -> origin/gh/kwen2501/196/head 2025-08-14T21:20:15.0499567Z * [new branch] gh/kwen2501/196/orig -> origin/gh/kwen2501/196/orig 2025-08-14T21:20:15.0501998Z * [new branch] gh/kwen2501/197/base -> origin/gh/kwen2501/197/base 2025-08-14T21:20:15.0503736Z * [new branch] gh/kwen2501/197/head -> origin/gh/kwen2501/197/head 2025-08-14T21:20:15.0505504Z * [new branch] gh/kwen2501/197/orig -> origin/gh/kwen2501/197/orig 2025-08-14T21:20:15.0508027Z * [new branch] gh/kwen2501/198/base -> origin/gh/kwen2501/198/base 2025-08-14T21:20:15.0509733Z * [new branch] gh/kwen2501/198/head -> origin/gh/kwen2501/198/head 2025-08-14T21:20:15.0511478Z * [new branch] gh/kwen2501/198/orig -> origin/gh/kwen2501/198/orig 2025-08-14T21:20:15.0514557Z * [new branch] gh/kwen2501/199/base -> origin/gh/kwen2501/199/base 2025-08-14T21:20:15.0516452Z * [new branch] gh/kwen2501/199/head -> origin/gh/kwen2501/199/head 2025-08-14T21:20:15.0518154Z * [new branch] gh/kwen2501/199/orig -> origin/gh/kwen2501/199/orig 2025-08-14T21:20:15.0520653Z * [new branch] gh/kwen2501/200/base -> origin/gh/kwen2501/200/base 2025-08-14T21:20:15.0522396Z * [new branch] gh/kwen2501/200/head -> origin/gh/kwen2501/200/head 2025-08-14T21:20:15.0524175Z * [new branch] gh/kwen2501/200/orig -> origin/gh/kwen2501/200/orig 2025-08-14T21:20:15.0526891Z * [new branch] gh/kwen2501/201/base -> origin/gh/kwen2501/201/base 2025-08-14T21:20:15.0528669Z * [new branch] gh/kwen2501/201/head -> origin/gh/kwen2501/201/head 2025-08-14T21:20:15.0530475Z * [new branch] gh/kwen2501/201/orig -> origin/gh/kwen2501/201/orig 2025-08-14T21:20:15.0533044Z * [new branch] gh/kwen2501/202/base -> origin/gh/kwen2501/202/base 2025-08-14T21:20:15.0534776Z * [new branch] gh/kwen2501/202/head -> origin/gh/kwen2501/202/head 2025-08-14T21:20:15.0536593Z * [new branch] gh/kwen2501/202/orig -> origin/gh/kwen2501/202/orig 2025-08-14T21:20:15.0539237Z * [new branch] gh/kwen2501/203/base -> origin/gh/kwen2501/203/base 2025-08-14T21:20:15.0541042Z * [new branch] gh/kwen2501/203/head -> origin/gh/kwen2501/203/head 2025-08-14T21:20:15.0542753Z * [new branch] gh/kwen2501/203/orig -> origin/gh/kwen2501/203/orig 2025-08-14T21:20:15.0545871Z * [new branch] gh/laithsakka/152/base -> origin/gh/laithsakka/152/base 2025-08-14T21:20:15.0547474Z * [new branch] gh/laithsakka/152/head -> origin/gh/laithsakka/152/head 2025-08-14T21:20:15.0549607Z * [new branch] gh/laithsakka/152/orig -> origin/gh/laithsakka/152/orig 2025-08-14T21:20:15.0552250Z * [new branch] gh/laithsakka/156/base -> origin/gh/laithsakka/156/base 2025-08-14T21:20:15.0553999Z * [new branch] gh/laithsakka/156/head -> origin/gh/laithsakka/156/head 2025-08-14T21:20:15.0555834Z * [new branch] gh/laithsakka/156/orig -> origin/gh/laithsakka/156/orig 2025-08-14T21:20:15.0558301Z * [new branch] gh/laithsakka/159/base -> origin/gh/laithsakka/159/base 2025-08-14T21:20:15.0560062Z * [new branch] gh/laithsakka/159/head -> origin/gh/laithsakka/159/head 2025-08-14T21:20:15.0561929Z * [new branch] gh/laithsakka/159/orig -> origin/gh/laithsakka/159/orig 2025-08-14T21:20:15.0564560Z * [new branch] gh/laithsakka/160/base -> origin/gh/laithsakka/160/base 2025-08-14T21:20:15.0566411Z * [new branch] gh/laithsakka/160/head -> origin/gh/laithsakka/160/head 2025-08-14T21:20:15.0568113Z * [new branch] gh/laithsakka/160/orig -> origin/gh/laithsakka/160/orig 2025-08-14T21:20:15.0570557Z * [new branch] gh/laithsakka/178/base -> origin/gh/laithsakka/178/base 2025-08-14T21:20:15.0572387Z * [new branch] gh/laithsakka/178/head -> origin/gh/laithsakka/178/head 2025-08-14T21:20:15.0574201Z * [new branch] gh/laithsakka/178/orig -> origin/gh/laithsakka/178/orig 2025-08-14T21:20:15.0576637Z * [new branch] gh/laithsakka/191/base -> origin/gh/laithsakka/191/base 2025-08-14T21:20:15.0578402Z * [new branch] gh/laithsakka/191/head -> origin/gh/laithsakka/191/head 2025-08-14T21:20:15.0580155Z * [new branch] gh/laithsakka/191/orig -> origin/gh/laithsakka/191/orig 2025-08-14T21:20:15.0582602Z * [new branch] gh/laithsakka/234/base -> origin/gh/laithsakka/234/base 2025-08-14T21:20:15.0584422Z * [new branch] gh/laithsakka/234/head -> origin/gh/laithsakka/234/head 2025-08-14T21:20:15.0586305Z * [new branch] gh/laithsakka/234/orig -> origin/gh/laithsakka/234/orig 2025-08-14T21:20:15.0588772Z * [new branch] gh/laithsakka/237/base -> origin/gh/laithsakka/237/base 2025-08-14T21:20:15.0590546Z * [new branch] gh/laithsakka/237/head -> origin/gh/laithsakka/237/head 2025-08-14T21:20:15.0592389Z * [new branch] gh/laithsakka/237/orig -> origin/gh/laithsakka/237/orig 2025-08-14T21:20:15.0594783Z * [new branch] gh/laithsakka/238/base -> origin/gh/laithsakka/238/base 2025-08-14T21:20:15.0597062Z * [new branch] gh/laithsakka/238/head -> origin/gh/laithsakka/238/head 2025-08-14T21:20:15.0598859Z * [new branch] gh/laithsakka/238/orig -> origin/gh/laithsakka/238/orig 2025-08-14T21:20:15.0601323Z * [new branch] gh/laithsakka/239/base -> origin/gh/laithsakka/239/base 2025-08-14T21:20:15.0603109Z * [new branch] gh/laithsakka/239/head -> origin/gh/laithsakka/239/head 2025-08-14T21:20:15.0605013Z * [new branch] gh/laithsakka/239/orig -> origin/gh/laithsakka/239/orig 2025-08-14T21:20:15.0607472Z * [new branch] gh/laithsakka/240/base -> origin/gh/laithsakka/240/base 2025-08-14T21:20:15.0615859Z * [new branch] gh/laithsakka/240/head -> origin/gh/laithsakka/240/head 2025-08-14T21:20:15.0616111Z * [new branch] gh/laithsakka/240/orig -> origin/gh/laithsakka/240/orig 2025-08-14T21:20:15.0616340Z * [new branch] gh/laithsakka/242/base -> origin/gh/laithsakka/242/base 2025-08-14T21:20:15.0616554Z * [new branch] gh/laithsakka/242/head -> origin/gh/laithsakka/242/head 2025-08-14T21:20:15.0617533Z * [new branch] gh/laithsakka/242/orig -> origin/gh/laithsakka/242/orig 2025-08-14T21:20:15.0619835Z * [new branch] gh/laithsakka/243/base -> origin/gh/laithsakka/243/base 2025-08-14T21:20:15.0621628Z * [new branch] gh/laithsakka/243/head -> origin/gh/laithsakka/243/head 2025-08-14T21:20:15.0623276Z * [new branch] gh/laithsakka/243/orig -> origin/gh/laithsakka/243/orig 2025-08-14T21:20:15.0625738Z * [new branch] gh/laithsakka/244/base -> origin/gh/laithsakka/244/base 2025-08-14T21:20:15.0627527Z * [new branch] gh/laithsakka/244/head -> origin/gh/laithsakka/244/head 2025-08-14T21:20:15.0629312Z * [new branch] gh/laithsakka/244/orig -> origin/gh/laithsakka/244/orig 2025-08-14T21:20:15.0631823Z * [new branch] gh/laithsakka/245/base -> origin/gh/laithsakka/245/base 2025-08-14T21:20:15.0633540Z * [new branch] gh/laithsakka/245/head -> origin/gh/laithsakka/245/head 2025-08-14T21:20:15.0635388Z * [new branch] gh/laithsakka/245/orig -> origin/gh/laithsakka/245/orig 2025-08-14T21:20:15.0637876Z * [new branch] gh/laithsakka/246/base -> origin/gh/laithsakka/246/base 2025-08-14T21:20:15.0639576Z * [new branch] gh/laithsakka/246/head -> origin/gh/laithsakka/246/head 2025-08-14T21:20:15.0641435Z * [new branch] gh/laithsakka/246/orig -> origin/gh/laithsakka/246/orig 2025-08-14T21:20:15.0644527Z * [new branch] gh/laithsakka/247/base -> origin/gh/laithsakka/247/base 2025-08-14T21:20:15.0646451Z * [new branch] gh/laithsakka/247/head -> origin/gh/laithsakka/247/head 2025-08-14T21:20:15.0651226Z * [new branch] gh/laithsakka/247/orig -> origin/gh/laithsakka/247/orig 2025-08-14T21:20:15.0653685Z * [new branch] gh/laithsakka/248/base -> origin/gh/laithsakka/248/base 2025-08-14T21:20:15.0655468Z * [new branch] gh/laithsakka/248/head -> origin/gh/laithsakka/248/head 2025-08-14T21:20:15.0657271Z * [new branch] gh/laithsakka/248/orig -> origin/gh/laithsakka/248/orig 2025-08-14T21:20:15.0659868Z * [new branch] gh/laithsakka/249/base -> origin/gh/laithsakka/249/base 2025-08-14T21:20:15.0661695Z * [new branch] gh/laithsakka/249/head -> origin/gh/laithsakka/249/head 2025-08-14T21:20:15.0663964Z * [new branch] gh/laithsakka/249/orig -> origin/gh/laithsakka/249/orig 2025-08-14T21:20:15.0666135Z * [new branch] gh/laithsakka/250/base -> origin/gh/laithsakka/250/base 2025-08-14T21:20:15.0668128Z * [new branch] gh/laithsakka/250/head -> origin/gh/laithsakka/250/head 2025-08-14T21:20:15.0669623Z * [new branch] gh/laithsakka/250/orig -> origin/gh/laithsakka/250/orig 2025-08-14T21:20:15.0672089Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-08-14T21:20:15.0673909Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-08-14T21:20:15.0675632Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-08-14T21:20:15.0678688Z * [new branch] gh/laithsakka/252/base -> origin/gh/laithsakka/252/base 2025-08-14T21:20:15.0680435Z * [new branch] gh/laithsakka/252/head -> origin/gh/laithsakka/252/head 2025-08-14T21:20:15.0682181Z * [new branch] gh/laithsakka/252/orig -> origin/gh/laithsakka/252/orig 2025-08-14T21:20:15.0684656Z * [new branch] gh/laithsakka/253/base -> origin/gh/laithsakka/253/base 2025-08-14T21:20:15.0686526Z * [new branch] gh/laithsakka/253/head -> origin/gh/laithsakka/253/head 2025-08-14T21:20:15.0688362Z * [new branch] gh/laithsakka/253/orig -> origin/gh/laithsakka/253/orig 2025-08-14T21:20:15.0691002Z * [new branch] gh/laithsakka/254/base -> origin/gh/laithsakka/254/base 2025-08-14T21:20:15.0692661Z * [new branch] gh/laithsakka/254/head -> origin/gh/laithsakka/254/head 2025-08-14T21:20:15.0694405Z * [new branch] gh/laithsakka/254/orig -> origin/gh/laithsakka/254/orig 2025-08-14T21:20:15.0696969Z * [new branch] gh/laithsakka/255/base -> origin/gh/laithsakka/255/base 2025-08-14T21:20:15.0698655Z * [new branch] gh/laithsakka/255/head -> origin/gh/laithsakka/255/head 2025-08-14T21:20:15.0700388Z * [new branch] gh/laithsakka/255/orig -> origin/gh/laithsakka/255/orig 2025-08-14T21:20:15.0702914Z * [new branch] gh/laithsakka/256/base -> origin/gh/laithsakka/256/base 2025-08-14T21:20:15.0707206Z * [new branch] gh/laithsakka/256/head -> origin/gh/laithsakka/256/head 2025-08-14T21:20:15.0708764Z * [new branch] gh/laithsakka/256/orig -> origin/gh/laithsakka/256/orig 2025-08-14T21:20:15.0709316Z * [new branch] gh/laithsakka/257/base -> origin/gh/laithsakka/257/base 2025-08-14T21:20:15.0710922Z * [new branch] gh/laithsakka/257/head -> origin/gh/laithsakka/257/head 2025-08-14T21:20:15.0712619Z * [new branch] gh/laithsakka/257/orig -> origin/gh/laithsakka/257/orig 2025-08-14T21:20:15.0715211Z * [new branch] gh/laithsakka/258/base -> origin/gh/laithsakka/258/base 2025-08-14T21:20:15.0716947Z * [new branch] gh/laithsakka/258/head -> origin/gh/laithsakka/258/head 2025-08-14T21:20:15.0718735Z * [new branch] gh/laithsakka/258/orig -> origin/gh/laithsakka/258/orig 2025-08-14T21:20:15.0721306Z * [new branch] gh/laithsakka/259/base -> origin/gh/laithsakka/259/base 2025-08-14T21:20:15.0723057Z * [new branch] gh/laithsakka/259/head -> origin/gh/laithsakka/259/head 2025-08-14T21:20:15.0724865Z * [new branch] gh/laithsakka/259/orig -> origin/gh/laithsakka/259/orig 2025-08-14T21:20:15.0727511Z * [new branch] gh/laithsakka/260/base -> origin/gh/laithsakka/260/base 2025-08-14T21:20:15.0729227Z * [new branch] gh/laithsakka/260/head -> origin/gh/laithsakka/260/head 2025-08-14T21:20:15.0730991Z * [new branch] gh/laithsakka/260/orig -> origin/gh/laithsakka/260/orig 2025-08-14T21:20:15.0733545Z * [new branch] gh/laithsakka/261/base -> origin/gh/laithsakka/261/base 2025-08-14T21:20:15.0735808Z * [new branch] gh/laithsakka/261/head -> origin/gh/laithsakka/261/head 2025-08-14T21:20:15.0737630Z * [new branch] gh/laithsakka/261/orig -> origin/gh/laithsakka/261/orig 2025-08-14T21:20:15.0739986Z * [new branch] gh/laithsakka/262/base -> origin/gh/laithsakka/262/base 2025-08-14T21:20:15.0741794Z * [new branch] gh/laithsakka/262/head -> origin/gh/laithsakka/262/head 2025-08-14T21:20:15.0743509Z * [new branch] gh/laithsakka/262/orig -> origin/gh/laithsakka/262/orig 2025-08-14T21:20:15.0746333Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-08-14T21:20:15.0749030Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-08-14T21:20:15.0751380Z * [new branch] gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-08-14T21:20:15.0753219Z * [new branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-08-14T21:20:15.0756193Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-08-14T21:20:15.0757927Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-08-14T21:20:15.0760368Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 2025-08-14T21:20:15.0762066Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 2025-08-14T21:20:15.0767416Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-08-14T21:20:15.0769182Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-08-14T21:20:15.0771482Z * [new branch] gh/lucaskabela/10/base -> origin/gh/lucaskabela/10/base 2025-08-14T21:20:15.0773274Z * [new branch] gh/lucaskabela/10/head -> origin/gh/lucaskabela/10/head 2025-08-14T21:20:15.0774985Z * [new branch] gh/lucaskabela/10/orig -> origin/gh/lucaskabela/10/orig 2025-08-14T21:20:15.0777335Z * [new branch] gh/lucaskabela/11/base -> origin/gh/lucaskabela/11/base 2025-08-14T21:20:15.0779082Z * [new branch] gh/lucaskabela/11/head -> origin/gh/lucaskabela/11/head 2025-08-14T21:20:15.0780858Z * [new branch] gh/lucaskabela/11/orig -> origin/gh/lucaskabela/11/orig 2025-08-14T21:20:15.0783106Z * [new branch] gh/lucaskabela/12/base -> origin/gh/lucaskabela/12/base 2025-08-14T21:20:15.0785003Z * [new branch] gh/lucaskabela/12/head -> origin/gh/lucaskabela/12/head 2025-08-14T21:20:15.0786746Z * [new branch] gh/lucaskabela/12/orig -> origin/gh/lucaskabela/12/orig 2025-08-14T21:20:15.0789083Z * [new branch] gh/lucaskabela/13/base -> origin/gh/lucaskabela/13/base 2025-08-14T21:20:15.0790810Z * [new branch] gh/lucaskabela/13/head -> origin/gh/lucaskabela/13/head 2025-08-14T21:20:15.0792594Z * [new branch] gh/lucaskabela/13/orig -> origin/gh/lucaskabela/13/orig 2025-08-14T21:20:15.0794922Z * [new branch] gh/lucaskabela/14/base -> origin/gh/lucaskabela/14/base 2025-08-14T21:20:15.0796636Z * [new branch] gh/lucaskabela/14/head -> origin/gh/lucaskabela/14/head 2025-08-14T21:20:15.0798372Z * [new branch] gh/lucaskabela/14/orig -> origin/gh/lucaskabela/14/orig 2025-08-14T21:20:15.0800659Z * [new branch] gh/lucaskabela/15/base -> origin/gh/lucaskabela/15/base 2025-08-14T21:20:15.0802501Z * [new branch] gh/lucaskabela/15/head -> origin/gh/lucaskabela/15/head 2025-08-14T21:20:15.0804219Z * [new branch] gh/lucaskabela/15/orig -> origin/gh/lucaskabela/15/orig 2025-08-14T21:20:15.0806726Z * [new branch] gh/lucaskabela/16/base -> origin/gh/lucaskabela/16/base 2025-08-14T21:20:15.0808585Z * [new branch] gh/lucaskabela/16/head -> origin/gh/lucaskabela/16/head 2025-08-14T21:20:15.0810897Z * [new branch] gh/lucaskabela/16/orig -> origin/gh/lucaskabela/16/orig 2025-08-14T21:20:15.0813178Z * [new branch] gh/lucaskabela/17/base -> origin/gh/lucaskabela/17/base 2025-08-14T21:20:15.0814928Z * [new branch] gh/lucaskabela/17/head -> origin/gh/lucaskabela/17/head 2025-08-14T21:20:15.0816787Z * [new branch] gh/lucaskabela/17/orig -> origin/gh/lucaskabela/17/orig 2025-08-14T21:20:15.0819834Z * [new branch] gh/lucaskabela/2/base -> origin/gh/lucaskabela/2/base 2025-08-14T21:20:15.0821566Z * [new branch] gh/lucaskabela/2/head -> origin/gh/lucaskabela/2/head 2025-08-14T21:20:15.0823461Z * [new branch] gh/lucaskabela/2/orig -> origin/gh/lucaskabela/2/orig 2025-08-14T21:20:15.0826543Z * [new branch] gh/lucaskabela/3/base -> origin/gh/lucaskabela/3/base 2025-08-14T21:20:15.0828302Z * [new branch] gh/lucaskabela/3/head -> origin/gh/lucaskabela/3/head 2025-08-14T21:20:15.0830023Z * [new branch] gh/lucaskabela/3/orig -> origin/gh/lucaskabela/3/orig 2025-08-14T21:20:15.0832883Z * [new branch] gh/lucaskabela/4/base -> origin/gh/lucaskabela/4/base 2025-08-14T21:20:15.0834775Z * [new branch] gh/lucaskabela/4/head -> origin/gh/lucaskabela/4/head 2025-08-14T21:20:15.0836647Z * [new branch] gh/lucaskabela/4/orig -> origin/gh/lucaskabela/4/orig 2025-08-14T21:20:15.0839044Z * [new branch] gh/lucaskabela/5/base -> origin/gh/lucaskabela/5/base 2025-08-14T21:20:15.0840735Z * [new branch] gh/lucaskabela/5/head -> origin/gh/lucaskabela/5/head 2025-08-14T21:20:15.0842505Z * [new branch] gh/lucaskabela/5/orig -> origin/gh/lucaskabela/5/orig 2025-08-14T21:20:15.0844919Z * [new branch] gh/lucaskabela/6/base -> origin/gh/lucaskabela/6/base 2025-08-14T21:20:15.0846842Z * [new branch] gh/lucaskabela/6/head -> origin/gh/lucaskabela/6/head 2025-08-14T21:20:15.0848868Z * [new branch] gh/lucaskabela/6/orig -> origin/gh/lucaskabela/6/orig 2025-08-14T21:20:15.0851397Z * [new branch] gh/lucaskabela/7/base -> origin/gh/lucaskabela/7/base 2025-08-14T21:20:15.0853203Z * [new branch] gh/lucaskabela/7/head -> origin/gh/lucaskabela/7/head 2025-08-14T21:20:15.0854964Z * [new branch] gh/lucaskabela/7/orig -> origin/gh/lucaskabela/7/orig 2025-08-14T21:20:15.0857344Z * [new branch] gh/lucaskabela/8/base -> origin/gh/lucaskabela/8/base 2025-08-14T21:20:15.0859140Z * [new branch] gh/lucaskabela/8/head -> origin/gh/lucaskabela/8/head 2025-08-14T21:20:15.0860900Z * [new branch] gh/lucaskabela/8/orig -> origin/gh/lucaskabela/8/orig 2025-08-14T21:20:15.0863350Z * [new branch] gh/lucaskabela/9/base -> origin/gh/lucaskabela/9/base 2025-08-14T21:20:15.0865224Z * [new branch] gh/lucaskabela/9/head -> origin/gh/lucaskabela/9/head 2025-08-14T21:20:15.0867021Z * [new branch] gh/lucaskabela/9/orig -> origin/gh/lucaskabela/9/orig 2025-08-14T21:20:15.0869969Z * [new branch] gh/lw/1/base -> origin/gh/lw/1/base 2025-08-14T21:20:15.0871770Z * [new branch] gh/lw/1/head -> origin/gh/lw/1/head 2025-08-14T21:20:15.0873559Z * [new branch] gh/lw/1/orig -> origin/gh/lw/1/orig 2025-08-14T21:20:15.0875959Z * [new branch] gh/lw/2/base -> origin/gh/lw/2/base 2025-08-14T21:20:15.0877795Z * [new branch] gh/lw/2/head -> origin/gh/lw/2/head 2025-08-14T21:20:15.0879579Z * [new branch] gh/lw/2/orig -> origin/gh/lw/2/orig 2025-08-14T21:20:15.0882853Z * [new branch] gh/lw/3/base -> origin/gh/lw/3/base 2025-08-14T21:20:15.0884750Z * [new branch] gh/lw/3/head -> origin/gh/lw/3/head 2025-08-14T21:20:15.0886538Z * [new branch] gh/lw/3/orig -> origin/gh/lw/3/orig 2025-08-14T21:20:15.0889743Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 2025-08-14T21:20:15.0892261Z * [new branch] gh/malfet/330/base -> origin/gh/malfet/330/base 2025-08-14T21:20:15.0893992Z * [new branch] gh/malfet/330/head -> origin/gh/malfet/330/head 2025-08-14T21:20:15.0895686Z * [new branch] gh/malfet/330/orig -> origin/gh/malfet/330/orig 2025-08-14T21:20:15.0898234Z * [new branch] gh/malfet/396/base -> origin/gh/malfet/396/base 2025-08-14T21:20:15.0899991Z * [new branch] gh/malfet/396/head -> origin/gh/malfet/396/head 2025-08-14T21:20:15.0901761Z * [new branch] gh/malfet/396/orig -> origin/gh/malfet/396/orig 2025-08-14T21:20:15.0904259Z * [new branch] gh/malfet/397/base -> origin/gh/malfet/397/base 2025-08-14T21:20:15.0906216Z * [new branch] gh/malfet/397/head -> origin/gh/malfet/397/head 2025-08-14T21:20:15.0907977Z * [new branch] gh/malfet/397/orig -> origin/gh/malfet/397/orig 2025-08-14T21:20:15.0910484Z * [new branch] gh/malfet/398/base -> origin/gh/malfet/398/base 2025-08-14T21:20:15.0912228Z * [new branch] gh/malfet/398/head -> origin/gh/malfet/398/head 2025-08-14T21:20:15.0913806Z * [new branch] gh/malfet/398/orig -> origin/gh/malfet/398/orig 2025-08-14T21:20:15.0916268Z * [new branch] gh/malfet/399/base -> origin/gh/malfet/399/base 2025-08-14T21:20:15.0918110Z * [new branch] gh/malfet/399/head -> origin/gh/malfet/399/head 2025-08-14T21:20:15.0919790Z * [new branch] gh/malfet/399/orig -> origin/gh/malfet/399/orig 2025-08-14T21:20:15.0922350Z * [new branch] gh/malfet/414/base -> origin/gh/malfet/414/base 2025-08-14T21:20:15.0924085Z * [new branch] gh/malfet/414/head -> origin/gh/malfet/414/head 2025-08-14T21:20:15.0926205Z * [new branch] gh/malfet/414/orig -> origin/gh/malfet/414/orig 2025-08-14T21:20:15.0928631Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-08-14T21:20:15.0930399Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-08-14T21:20:15.0932144Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-08-14T21:20:15.0934656Z * [new branch] gh/malfet/418/base -> origin/gh/malfet/418/base 2025-08-14T21:20:15.0936766Z * [new branch] gh/malfet/418/head -> origin/gh/malfet/418/head 2025-08-14T21:20:15.0938479Z * [new branch] gh/malfet/418/orig -> origin/gh/malfet/418/orig 2025-08-14T21:20:15.0940900Z * [new branch] gh/malfet/422/base -> origin/gh/malfet/422/base 2025-08-14T21:20:15.0942705Z * [new branch] gh/malfet/422/head -> origin/gh/malfet/422/head 2025-08-14T21:20:15.0944582Z * [new branch] gh/malfet/422/orig -> origin/gh/malfet/422/orig 2025-08-14T21:20:15.0947203Z * [new branch] gh/malfet/438/base -> origin/gh/malfet/438/base 2025-08-14T21:20:15.0953177Z * [new branch] gh/malfet/438/head -> origin/gh/malfet/438/head 2025-08-14T21:20:15.0954737Z * [new branch] gh/malfet/438/orig -> origin/gh/malfet/438/orig 2025-08-14T21:20:15.0957822Z * [new branch] gh/malfet/439/base -> origin/gh/malfet/439/base 2025-08-14T21:20:15.0959703Z * [new branch] gh/malfet/439/head -> origin/gh/malfet/439/head 2025-08-14T21:20:15.0961429Z * [new branch] gh/malfet/439/orig -> origin/gh/malfet/439/orig 2025-08-14T21:20:15.0964616Z * [new branch] gh/malfet/440/base -> origin/gh/malfet/440/base 2025-08-14T21:20:15.0966569Z * [new branch] gh/malfet/440/head -> origin/gh/malfet/440/head 2025-08-14T21:20:15.0968418Z * [new branch] gh/malfet/440/orig -> origin/gh/malfet/440/orig 2025-08-14T21:20:15.0970923Z * [new branch] gh/malfet/441/base -> origin/gh/malfet/441/base 2025-08-14T21:20:15.0972739Z * [new branch] gh/malfet/441/head -> origin/gh/malfet/441/head 2025-08-14T21:20:15.0974631Z * [new branch] gh/malfet/441/orig -> origin/gh/malfet/441/orig 2025-08-14T21:20:15.0977222Z * [new branch] gh/malfet/442/base -> origin/gh/malfet/442/base 2025-08-14T21:20:15.0978970Z * [new branch] gh/malfet/442/head -> origin/gh/malfet/442/head 2025-08-14T21:20:15.0980742Z * [new branch] gh/malfet/442/orig -> origin/gh/malfet/442/orig 2025-08-14T21:20:15.0983164Z * [new branch] gh/malfet/443/base -> origin/gh/malfet/443/base 2025-08-14T21:20:15.0985094Z * [new branch] gh/malfet/443/head -> origin/gh/malfet/443/head 2025-08-14T21:20:15.0986852Z * [new branch] gh/malfet/443/orig -> origin/gh/malfet/443/orig 2025-08-14T21:20:15.0989579Z * [new branch] gh/malfet/444/base -> origin/gh/malfet/444/base 2025-08-14T21:20:15.0991176Z * [new branch] gh/malfet/444/head -> origin/gh/malfet/444/head 2025-08-14T21:20:15.0993058Z * [new branch] gh/malfet/444/orig -> origin/gh/malfet/444/orig 2025-08-14T21:20:15.0996146Z * [new branch] gh/malfet/445/base -> origin/gh/malfet/445/base 2025-08-14T21:20:15.0997945Z * [new branch] gh/malfet/445/head -> origin/gh/malfet/445/head 2025-08-14T21:20:15.0999836Z * [new branch] gh/malfet/445/orig -> origin/gh/malfet/445/orig 2025-08-14T21:20:15.1002757Z * [new branch] gh/malfet/446/base -> origin/gh/malfet/446/base 2025-08-14T21:20:15.1004721Z * [new branch] gh/malfet/446/head -> origin/gh/malfet/446/head 2025-08-14T21:20:15.1006604Z * [new branch] gh/malfet/446/orig -> origin/gh/malfet/446/orig 2025-08-14T21:20:15.1009210Z * [new branch] gh/malfet/447/base -> origin/gh/malfet/447/base 2025-08-14T21:20:15.1011090Z * [new branch] gh/malfet/447/head -> origin/gh/malfet/447/head 2025-08-14T21:20:15.1013783Z * [new branch] gh/malfet/448/base -> origin/gh/malfet/448/base 2025-08-14T21:20:15.1015547Z * [new branch] gh/malfet/448/head -> origin/gh/malfet/448/head 2025-08-14T21:20:15.1018013Z * [new branch] gh/malfet/449/base -> origin/gh/malfet/449/base 2025-08-14T21:20:15.1019713Z * [new branch] gh/malfet/449/head -> origin/gh/malfet/449/head 2025-08-14T21:20:15.1022312Z * [new branch] gh/malfet/450/base -> origin/gh/malfet/450/base 2025-08-14T21:20:15.1024047Z * [new branch] gh/malfet/450/head -> origin/gh/malfet/450/head 2025-08-14T21:20:15.1027222Z * [new branch] gh/malfet/451/base -> origin/gh/malfet/451/base 2025-08-14T21:20:15.1028172Z * [new branch] gh/malfet/451/head -> origin/gh/malfet/451/head 2025-08-14T21:20:15.1030942Z * [new branch] gh/malfet/452/base -> origin/gh/malfet/452/base 2025-08-14T21:20:15.1032660Z * [new branch] gh/malfet/452/head -> origin/gh/malfet/452/head 2025-08-14T21:20:15.1034451Z * [new branch] gh/malfet/452/orig -> origin/gh/malfet/452/orig 2025-08-14T21:20:15.1036991Z * [new branch] gh/malfet/453/base -> origin/gh/malfet/453/base 2025-08-14T21:20:15.1038706Z * [new branch] gh/malfet/453/head -> origin/gh/malfet/453/head 2025-08-14T21:20:15.1040508Z * [new branch] gh/malfet/453/orig -> origin/gh/malfet/453/orig 2025-08-14T21:20:15.1043009Z * [new branch] gh/malfet/454/base -> origin/gh/malfet/454/base 2025-08-14T21:20:15.1044761Z * [new branch] gh/malfet/454/head -> origin/gh/malfet/454/head 2025-08-14T21:20:15.1046661Z * [new branch] gh/malfet/454/orig -> origin/gh/malfet/454/orig 2025-08-14T21:20:15.1049367Z * [new branch] gh/malfet/455/base -> origin/gh/malfet/455/base 2025-08-14T21:20:15.1051101Z * [new branch] gh/malfet/455/head -> origin/gh/malfet/455/head 2025-08-14T21:20:15.1052999Z * [new branch] gh/malfet/455/orig -> origin/gh/malfet/455/orig 2025-08-14T21:20:15.1055630Z * [new branch] gh/malfet/456/base -> origin/gh/malfet/456/base 2025-08-14T21:20:15.1057322Z * [new branch] gh/malfet/456/head -> origin/gh/malfet/456/head 2025-08-14T21:20:15.1059182Z * [new branch] gh/malfet/456/orig -> origin/gh/malfet/456/orig 2025-08-14T21:20:15.1061814Z * [new branch] gh/malfet/457/base -> origin/gh/malfet/457/base 2025-08-14T21:20:15.1063521Z * [new branch] gh/malfet/457/head -> origin/gh/malfet/457/head 2025-08-14T21:20:15.1065419Z * [new branch] gh/malfet/457/orig -> origin/gh/malfet/457/orig 2025-08-14T21:20:15.1067953Z * [new branch] gh/malfet/458/base -> origin/gh/malfet/458/base 2025-08-14T21:20:15.1069677Z * [new branch] gh/malfet/458/head -> origin/gh/malfet/458/head 2025-08-14T21:20:15.1071560Z * [new branch] gh/malfet/458/orig -> origin/gh/malfet/458/orig 2025-08-14T21:20:15.1074020Z * [new branch] gh/malfet/459/base -> origin/gh/malfet/459/base 2025-08-14T21:20:15.1075709Z * [new branch] gh/malfet/459/head -> origin/gh/malfet/459/head 2025-08-14T21:20:15.1077520Z * [new branch] gh/malfet/459/orig -> origin/gh/malfet/459/orig 2025-08-14T21:20:15.1080044Z * [new branch] gh/malfet/460/base -> origin/gh/malfet/460/base 2025-08-14T21:20:15.1081882Z * [new branch] gh/malfet/460/head -> origin/gh/malfet/460/head 2025-08-14T21:20:15.1083813Z * [new branch] gh/malfet/460/orig -> origin/gh/malfet/460/orig 2025-08-14T21:20:15.1086538Z * [new branch] gh/malfet/461/base -> origin/gh/malfet/461/base 2025-08-14T21:20:15.1088778Z * [new branch] gh/malfet/461/head -> origin/gh/malfet/461/head 2025-08-14T21:20:15.1090557Z * [new branch] gh/malfet/461/orig -> origin/gh/malfet/461/orig 2025-08-14T21:20:15.1093098Z * [new branch] gh/malfet/462/base -> origin/gh/malfet/462/base 2025-08-14T21:20:15.1094848Z * [new branch] gh/malfet/462/head -> origin/gh/malfet/462/head 2025-08-14T21:20:15.1096593Z * [new branch] gh/malfet/462/orig -> origin/gh/malfet/462/orig 2025-08-14T21:20:15.1099120Z * [new branch] gh/malfet/463/base -> origin/gh/malfet/463/base 2025-08-14T21:20:15.1100918Z * [new branch] gh/malfet/463/head -> origin/gh/malfet/463/head 2025-08-14T21:20:15.1102738Z * [new branch] gh/malfet/463/orig -> origin/gh/malfet/463/orig 2025-08-14T21:20:15.1105231Z * [new branch] gh/malfet/464/base -> origin/gh/malfet/464/base 2025-08-14T21:20:15.1106979Z * [new branch] gh/malfet/464/head -> origin/gh/malfet/464/head 2025-08-14T21:20:15.1108838Z * [new branch] gh/malfet/464/orig -> origin/gh/malfet/464/orig 2025-08-14T21:20:15.1111287Z * [new branch] gh/malfet/465/base -> origin/gh/malfet/465/base 2025-08-14T21:20:15.1113060Z * [new branch] gh/malfet/465/head -> origin/gh/malfet/465/head 2025-08-14T21:20:15.1114833Z * [new branch] gh/malfet/465/orig -> origin/gh/malfet/465/orig 2025-08-14T21:20:15.1117381Z * [new branch] gh/malfet/466/base -> origin/gh/malfet/466/base 2025-08-14T21:20:15.1119134Z * [new branch] gh/malfet/466/head -> origin/gh/malfet/466/head 2025-08-14T21:20:15.1120954Z * [new branch] gh/malfet/466/orig -> origin/gh/malfet/466/orig 2025-08-14T21:20:15.1123448Z * [new branch] gh/malfet/467/base -> origin/gh/malfet/467/base 2025-08-14T21:20:15.1125380Z * [new branch] gh/malfet/467/head -> origin/gh/malfet/467/head 2025-08-14T21:20:15.1127139Z * [new branch] gh/malfet/467/orig -> origin/gh/malfet/467/orig 2025-08-14T21:20:15.1129642Z * [new branch] gh/malfet/468/base -> origin/gh/malfet/468/base 2025-08-14T21:20:15.1131462Z * [new branch] gh/malfet/468/head -> origin/gh/malfet/468/head 2025-08-14T21:20:15.1133394Z * [new branch] gh/malfet/468/orig -> origin/gh/malfet/468/orig 2025-08-14T21:20:15.1135844Z * [new branch] gh/malfet/469/base -> origin/gh/malfet/469/base 2025-08-14T21:20:15.1137581Z * [new branch] gh/malfet/469/head -> origin/gh/malfet/469/head 2025-08-14T21:20:15.1139554Z * [new branch] gh/malfet/469/orig -> origin/gh/malfet/469/orig 2025-08-14T21:20:15.1142003Z * [new branch] gh/malfet/470/base -> origin/gh/malfet/470/base 2025-08-14T21:20:15.1143727Z * [new branch] gh/malfet/470/head -> origin/gh/malfet/470/head 2025-08-14T21:20:15.1145538Z * [new branch] gh/malfet/470/orig -> origin/gh/malfet/470/orig 2025-08-14T21:20:15.1148470Z * [new branch] gh/malfet/471/base -> origin/gh/malfet/471/base 2025-08-14T21:20:15.1150714Z * [new branch] gh/malfet/471/head -> origin/gh/malfet/471/head 2025-08-14T21:20:15.1152485Z * [new branch] gh/malfet/471/orig -> origin/gh/malfet/471/orig 2025-08-14T21:20:15.1155049Z * [new branch] gh/malfet/472/base -> origin/gh/malfet/472/base 2025-08-14T21:20:15.1156822Z * [new branch] gh/malfet/472/head -> origin/gh/malfet/472/head 2025-08-14T21:20:15.1158741Z * [new branch] gh/malfet/472/orig -> origin/gh/malfet/472/orig 2025-08-14T21:20:15.1161190Z * [new branch] gh/malfet/473/base -> origin/gh/malfet/473/base 2025-08-14T21:20:15.1162930Z * [new branch] gh/malfet/473/head -> origin/gh/malfet/473/head 2025-08-14T21:20:15.1164809Z * [new branch] gh/malfet/473/orig -> origin/gh/malfet/473/orig 2025-08-14T21:20:15.1167354Z * [new branch] gh/malfet/474/base -> origin/gh/malfet/474/base 2025-08-14T21:20:15.1169039Z * [new branch] gh/malfet/474/head -> origin/gh/malfet/474/head 2025-08-14T21:20:15.1170787Z * [new branch] gh/malfet/474/orig -> origin/gh/malfet/474/orig 2025-08-14T21:20:15.1173440Z * [new branch] gh/malfet/475/base -> origin/gh/malfet/475/base 2025-08-14T21:20:15.1175191Z * [new branch] gh/malfet/475/head -> origin/gh/malfet/475/head 2025-08-14T21:20:15.1176963Z * [new branch] gh/malfet/475/orig -> origin/gh/malfet/475/orig 2025-08-14T21:20:15.1179447Z * [new branch] gh/malfet/476/base -> origin/gh/malfet/476/base 2025-08-14T21:20:15.1181221Z * [new branch] gh/malfet/476/head -> origin/gh/malfet/476/head 2025-08-14T21:20:15.1183125Z * [new branch] gh/malfet/476/orig -> origin/gh/malfet/476/orig 2025-08-14T21:20:15.1185517Z * [new branch] gh/malfet/477/base -> origin/gh/malfet/477/base 2025-08-14T21:20:15.1187290Z * [new branch] gh/malfet/477/head -> origin/gh/malfet/477/head 2025-08-14T21:20:15.1189159Z * [new branch] gh/malfet/477/orig -> origin/gh/malfet/477/orig 2025-08-14T21:20:15.1191643Z * [new branch] gh/malfet/478/base -> origin/gh/malfet/478/base 2025-08-14T21:20:15.1193433Z * [new branch] gh/malfet/478/head -> origin/gh/malfet/478/head 2025-08-14T21:20:15.1195198Z * [new branch] gh/malfet/478/orig -> origin/gh/malfet/478/orig 2025-08-14T21:20:15.1197676Z * [new branch] gh/malfet/479/base -> origin/gh/malfet/479/base 2025-08-14T21:20:15.1199443Z * [new branch] gh/malfet/479/head -> origin/gh/malfet/479/head 2025-08-14T21:20:15.1201269Z * [new branch] gh/malfet/479/orig -> origin/gh/malfet/479/orig 2025-08-14T21:20:15.1203774Z * [new branch] gh/malfet/480/base -> origin/gh/malfet/480/base 2025-08-14T21:20:15.1205776Z * [new branch] gh/malfet/480/head -> origin/gh/malfet/480/head 2025-08-14T21:20:15.1207616Z * [new branch] gh/malfet/480/orig -> origin/gh/malfet/480/orig 2025-08-14T21:20:15.1210152Z * [new branch] gh/malfet/481/base -> origin/gh/malfet/481/base 2025-08-14T21:20:15.1212596Z * [new branch] gh/malfet/481/head -> origin/gh/malfet/481/head 2025-08-14T21:20:15.1214289Z * [new branch] gh/malfet/481/orig -> origin/gh/malfet/481/orig 2025-08-14T21:20:15.1216672Z * [new branch] gh/malfet/482/base -> origin/gh/malfet/482/base 2025-08-14T21:20:15.1218402Z * [new branch] gh/malfet/482/head -> origin/gh/malfet/482/head 2025-08-14T21:20:15.1220136Z * [new branch] gh/malfet/482/orig -> origin/gh/malfet/482/orig 2025-08-14T21:20:15.1222732Z * [new branch] gh/malfet/483/base -> origin/gh/malfet/483/base 2025-08-14T21:20:15.1224564Z * [new branch] gh/malfet/483/head -> origin/gh/malfet/483/head 2025-08-14T21:20:15.1226393Z * [new branch] gh/malfet/483/orig -> origin/gh/malfet/483/orig 2025-08-14T21:20:15.1228908Z * [new branch] gh/malfet/484/base -> origin/gh/malfet/484/base 2025-08-14T21:20:15.1230710Z * [new branch] gh/malfet/484/head -> origin/gh/malfet/484/head 2025-08-14T21:20:15.1232554Z * [new branch] gh/malfet/484/orig -> origin/gh/malfet/484/orig 2025-08-14T21:20:15.1235064Z * [new branch] gh/malfet/485/base -> origin/gh/malfet/485/base 2025-08-14T21:20:15.1236879Z * [new branch] gh/malfet/485/head -> origin/gh/malfet/485/head 2025-08-14T21:20:15.1238681Z * [new branch] gh/malfet/485/orig -> origin/gh/malfet/485/orig 2025-08-14T21:20:15.1241214Z * [new branch] gh/malfet/486/base -> origin/gh/malfet/486/base 2025-08-14T21:20:15.1242932Z * [new branch] gh/malfet/486/head -> origin/gh/malfet/486/head 2025-08-14T21:20:15.1244797Z * [new branch] gh/malfet/486/orig -> origin/gh/malfet/486/orig 2025-08-14T21:20:15.1247459Z * [new branch] gh/malfet/487/base -> origin/gh/malfet/487/base 2025-08-14T21:20:15.1252316Z * [new branch] gh/malfet/487/head -> origin/gh/malfet/487/head 2025-08-14T21:20:15.1254126Z * [new branch] gh/malfet/487/orig -> origin/gh/malfet/487/orig 2025-08-14T21:20:15.1256613Z * [new branch] gh/malfet/488/base -> origin/gh/malfet/488/base 2025-08-14T21:20:15.1258396Z * [new branch] gh/malfet/488/head -> origin/gh/malfet/488/head 2025-08-14T21:20:15.1260225Z * [new branch] gh/malfet/488/orig -> origin/gh/malfet/488/orig 2025-08-14T21:20:15.1263078Z * [new branch] gh/malfet/489/base -> origin/gh/malfet/489/base 2025-08-14T21:20:15.1264838Z * [new branch] gh/malfet/489/head -> origin/gh/malfet/489/head 2025-08-14T21:20:15.1266789Z * [new branch] gh/malfet/489/orig -> origin/gh/malfet/489/orig 2025-08-14T21:20:15.1269315Z * [new branch] gh/malfet/490/base -> origin/gh/malfet/490/base 2025-08-14T21:20:15.1271307Z * [new branch] gh/malfet/490/head -> origin/gh/malfet/490/head 2025-08-14T21:20:15.1273139Z * [new branch] gh/malfet/490/orig -> origin/gh/malfet/490/orig 2025-08-14T21:20:15.1275740Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-08-14T21:20:15.1277291Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-08-14T21:20:15.1280355Z * [new branch] gh/manuelcandales/10/base -> origin/gh/manuelcandales/10/base 2025-08-14T21:20:15.1282109Z * [new branch] gh/manuelcandales/10/head -> origin/gh/manuelcandales/10/head 2025-08-14T21:20:15.1283967Z * [new branch] gh/manuelcandales/10/orig -> origin/gh/manuelcandales/10/orig 2025-08-14T21:20:15.1286601Z * [new branch] gh/manuelcandales/9/base -> origin/gh/manuelcandales/9/base 2025-08-14T21:20:15.1288368Z * [new branch] gh/manuelcandales/9/head -> origin/gh/manuelcandales/9/head 2025-08-14T21:20:15.1290233Z * [new branch] gh/manuelcandales/9/orig -> origin/gh/manuelcandales/9/orig 2025-08-14T21:20:15.1293224Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-08-14T21:20:15.1296928Z * [new branch] gh/masnesral/204/base -> origin/gh/masnesral/204/base 2025-08-14T21:20:15.1298838Z * [new branch] gh/masnesral/204/head -> origin/gh/masnesral/204/head 2025-08-14T21:20:15.1300722Z * [new branch] gh/masnesral/204/orig -> origin/gh/masnesral/204/orig 2025-08-14T21:20:15.1303237Z * [new branch] gh/masnesral/223/base -> origin/gh/masnesral/223/base 2025-08-14T21:20:15.1304995Z * [new branch] gh/masnesral/223/head -> origin/gh/masnesral/223/head 2025-08-14T21:20:15.1306954Z * [new branch] gh/masnesral/223/orig -> origin/gh/masnesral/223/orig 2025-08-14T21:20:15.1309342Z * [new branch] gh/masnesral/224/base -> origin/gh/masnesral/224/base 2025-08-14T21:20:15.1311129Z * [new branch] gh/masnesral/224/head -> origin/gh/masnesral/224/head 2025-08-14T21:20:15.1312891Z * [new branch] gh/masnesral/224/orig -> origin/gh/masnesral/224/orig 2025-08-14T21:20:15.1315274Z * [new branch] gh/masnesral/225/base -> origin/gh/masnesral/225/base 2025-08-14T21:20:15.1317130Z * [new branch] gh/masnesral/225/head -> origin/gh/masnesral/225/head 2025-08-14T21:20:15.1319453Z * [new branch] gh/masnesral/225/orig -> origin/gh/masnesral/225/orig 2025-08-14T21:20:15.1321909Z * [new branch] gh/masnesral/226/base -> origin/gh/masnesral/226/base 2025-08-14T21:20:15.1323835Z * [new branch] gh/masnesral/226/head -> origin/gh/masnesral/226/head 2025-08-14T21:20:15.1325819Z * [new branch] gh/masnesral/226/orig -> origin/gh/masnesral/226/orig 2025-08-14T21:20:15.1328342Z * [new branch] gh/masnesral/227/base -> origin/gh/masnesral/227/base 2025-08-14T21:20:15.1330202Z * [new branch] gh/masnesral/227/head -> origin/gh/masnesral/227/head 2025-08-14T21:20:15.1332144Z * [new branch] gh/masnesral/227/orig -> origin/gh/masnesral/227/orig 2025-08-14T21:20:15.1335211Z * [new branch] gh/masnesral/228/base -> origin/gh/masnesral/228/base 2025-08-14T21:20:15.1337066Z * [new branch] gh/masnesral/228/head -> origin/gh/masnesral/228/head 2025-08-14T21:20:15.1338875Z * [new branch] gh/masnesral/228/orig -> origin/gh/masnesral/228/orig 2025-08-14T21:20:15.1341344Z * [new branch] gh/masnesral/229/base -> origin/gh/masnesral/229/base 2025-08-14T21:20:15.1343226Z * [new branch] gh/masnesral/229/head -> origin/gh/masnesral/229/head 2025-08-14T21:20:15.1345076Z * [new branch] gh/masnesral/229/orig -> origin/gh/masnesral/229/orig 2025-08-14T21:20:15.1347537Z * [new branch] gh/masnesral/230/base -> origin/gh/masnesral/230/base 2025-08-14T21:20:15.1349481Z * [new branch] gh/masnesral/230/head -> origin/gh/masnesral/230/head 2025-08-14T21:20:15.1351336Z * [new branch] gh/masnesral/230/orig -> origin/gh/masnesral/230/orig 2025-08-14T21:20:15.1353816Z * [new branch] gh/masnesral/231/base -> origin/gh/masnesral/231/base 2025-08-14T21:20:15.1355773Z * [new branch] gh/masnesral/231/head -> origin/gh/masnesral/231/head 2025-08-14T21:20:15.1357669Z * [new branch] gh/masnesral/231/orig -> origin/gh/masnesral/231/orig 2025-08-14T21:20:15.1360221Z * [new branch] gh/masnesral/232/base -> origin/gh/masnesral/232/base 2025-08-14T21:20:15.1362059Z * [new branch] gh/masnesral/232/head -> origin/gh/masnesral/232/head 2025-08-14T21:20:15.1364008Z * [new branch] gh/masnesral/232/orig -> origin/gh/masnesral/232/orig 2025-08-14T21:20:15.1366592Z * [new branch] gh/masnesral/233/base -> origin/gh/masnesral/233/base 2025-08-14T21:20:15.1368373Z * [new branch] gh/masnesral/233/head -> origin/gh/masnesral/233/head 2025-08-14T21:20:15.1370144Z * [new branch] gh/masnesral/233/orig -> origin/gh/masnesral/233/orig 2025-08-14T21:20:15.1372762Z * [new branch] gh/masnesral/234/base -> origin/gh/masnesral/234/base 2025-08-14T21:20:15.1374569Z * [new branch] gh/masnesral/234/head -> origin/gh/masnesral/234/head 2025-08-14T21:20:15.1376384Z * [new branch] gh/masnesral/234/orig -> origin/gh/masnesral/234/orig 2025-08-14T21:20:15.1378880Z * [new branch] gh/masnesral/235/base -> origin/gh/masnesral/235/base 2025-08-14T21:20:15.1380659Z * [new branch] gh/masnesral/235/head -> origin/gh/masnesral/235/head 2025-08-14T21:20:15.1382588Z * [new branch] gh/masnesral/235/orig -> origin/gh/masnesral/235/orig 2025-08-14T21:20:15.1385129Z * [new branch] gh/masnesral/236/base -> origin/gh/masnesral/236/base 2025-08-14T21:20:15.1386962Z * [new branch] gh/masnesral/236/head -> origin/gh/masnesral/236/head 2025-08-14T21:20:15.1388740Z * [new branch] gh/masnesral/236/orig -> origin/gh/masnesral/236/orig 2025-08-14T21:20:15.1391215Z * [new branch] gh/masnesral/34/base -> origin/gh/masnesral/34/base 2025-08-14T21:20:15.1394785Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-08-14T21:20:15.1396754Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-08-14T21:20:15.1399071Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-08-14T21:20:15.1400870Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 2025-08-14T21:20:15.1403225Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-08-14T21:20:15.1405306Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-08-14T21:20:15.1407568Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-08-14T21:20:15.1409280Z * [new branch] gh/mhorowitz/3/head -> origin/gh/mhorowitz/3/head 2025-08-14T21:20:15.1411596Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 2025-08-14T21:20:15.1413460Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-08-14T21:20:15.1416474Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-08-14T21:20:15.1418182Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-08-14T21:20:15.1420532Z * [new branch] gh/mhorowitz/6/base -> origin/gh/mhorowitz/6/base 2025-08-14T21:20:15.1422163Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-08-14T21:20:15.1425340Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-08-14T21:20:15.1427332Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-08-14T21:20:15.1429759Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 2025-08-14T21:20:15.1431493Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-08-14T21:20:15.1434359Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-08-14T21:20:15.1436214Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-08-14T21:20:15.1438514Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-08-14T21:20:15.1440283Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-08-14T21:20:15.1442672Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-08-14T21:20:15.1444574Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-08-14T21:20:15.1447396Z * [new branch] gh/mikaylagawarecki/313/base -> origin/gh/mikaylagawarecki/313/base 2025-08-14T21:20:15.1449162Z * [new branch] gh/mikaylagawarecki/313/head -> origin/gh/mikaylagawarecki/313/head 2025-08-14T21:20:15.1450964Z * [new branch] gh/mikaylagawarecki/313/orig -> origin/gh/mikaylagawarecki/313/orig 2025-08-14T21:20:15.1453458Z * [new branch] gh/mikaylagawarecki/317/base -> origin/gh/mikaylagawarecki/317/base 2025-08-14T21:20:15.1455329Z * [new branch] gh/mikaylagawarecki/317/head -> origin/gh/mikaylagawarecki/317/head 2025-08-14T21:20:15.1457095Z * [new branch] gh/mikaylagawarecki/317/orig -> origin/gh/mikaylagawarecki/317/orig 2025-08-14T21:20:15.1459567Z * [new branch] gh/mikaylagawarecki/318/base -> origin/gh/mikaylagawarecki/318/base 2025-08-14T21:20:15.1461324Z * [new branch] gh/mikaylagawarecki/318/head -> origin/gh/mikaylagawarecki/318/head 2025-08-14T21:20:15.1463126Z * [new branch] gh/mikaylagawarecki/318/orig -> origin/gh/mikaylagawarecki/318/orig 2025-08-14T21:20:15.1465689Z * [new branch] gh/mikaylagawarecki/319/base -> origin/gh/mikaylagawarecki/319/base 2025-08-14T21:20:15.1467371Z * [new branch] gh/mikaylagawarecki/319/head -> origin/gh/mikaylagawarecki/319/head 2025-08-14T21:20:15.1469134Z * [new branch] gh/mikaylagawarecki/319/orig -> origin/gh/mikaylagawarecki/319/orig 2025-08-14T21:20:15.1471558Z * [new branch] gh/mikaylagawarecki/320/base -> origin/gh/mikaylagawarecki/320/base 2025-08-14T21:20:15.1480782Z * [new branch] gh/mikaylagawarecki/320/head -> origin/gh/mikaylagawarecki/320/head 2025-08-14T21:20:15.1481478Z * [new branch] gh/mikaylagawarecki/320/orig -> origin/gh/mikaylagawarecki/320/orig 2025-08-14T21:20:15.1482097Z * [new branch] gh/mikaylagawarecki/321/base -> origin/gh/mikaylagawarecki/321/base 2025-08-14T21:20:15.1482696Z * [new branch] gh/mikaylagawarecki/321/head -> origin/gh/mikaylagawarecki/321/head 2025-08-14T21:20:15.1483301Z * [new branch] gh/mikaylagawarecki/321/orig -> origin/gh/mikaylagawarecki/321/orig 2025-08-14T21:20:15.1485223Z * [new branch] gh/mikaylagawarecki/322/base -> origin/gh/mikaylagawarecki/322/base 2025-08-14T21:20:15.1486476Z * [new branch] gh/mikaylagawarecki/322/head -> origin/gh/mikaylagawarecki/322/head 2025-08-14T21:20:15.1488356Z * [new branch] gh/mikaylagawarecki/322/orig -> origin/gh/mikaylagawarecki/322/orig 2025-08-14T21:20:15.1490821Z * [new branch] gh/mikaylagawarecki/323/base -> origin/gh/mikaylagawarecki/323/base 2025-08-14T21:20:15.1492662Z * [new branch] gh/mikaylagawarecki/323/head -> origin/gh/mikaylagawarecki/323/head 2025-08-14T21:20:15.1494497Z * [new branch] gh/mikaylagawarecki/323/orig -> origin/gh/mikaylagawarecki/323/orig 2025-08-14T21:20:15.1496872Z * [new branch] gh/mikaylagawarecki/324/base -> origin/gh/mikaylagawarecki/324/base 2025-08-14T21:20:15.1498656Z * [new branch] gh/mikaylagawarecki/324/head -> origin/gh/mikaylagawarecki/324/head 2025-08-14T21:20:15.1500580Z * [new branch] gh/mikaylagawarecki/324/orig -> origin/gh/mikaylagawarecki/324/orig 2025-08-14T21:20:15.1502877Z * [new branch] gh/mikaylagawarecki/325/base -> origin/gh/mikaylagawarecki/325/base 2025-08-14T21:20:15.1504646Z * [new branch] gh/mikaylagawarecki/325/head -> origin/gh/mikaylagawarecki/325/head 2025-08-14T21:20:15.1506593Z * [new branch] gh/mikaylagawarecki/325/orig -> origin/gh/mikaylagawarecki/325/orig 2025-08-14T21:20:15.1508779Z * [new branch] gh/mikaylagawarecki/326/base -> origin/gh/mikaylagawarecki/326/base 2025-08-14T21:20:15.1510518Z * [new branch] gh/mikaylagawarecki/326/head -> origin/gh/mikaylagawarecki/326/head 2025-08-14T21:20:15.1512289Z * [new branch] gh/mikaylagawarecki/326/orig -> origin/gh/mikaylagawarecki/326/orig 2025-08-14T21:20:15.1515155Z * [new branch] gh/mikaylagawarecki/327/base -> origin/gh/mikaylagawarecki/327/base 2025-08-14T21:20:15.1516957Z * [new branch] gh/mikaylagawarecki/327/head -> origin/gh/mikaylagawarecki/327/head 2025-08-14T21:20:15.1518745Z * [new branch] gh/mikaylagawarecki/327/orig -> origin/gh/mikaylagawarecki/327/orig 2025-08-14T21:20:15.1521478Z * [new branch] gh/mikaylagawarecki/328/base -> origin/gh/mikaylagawarecki/328/base 2025-08-14T21:20:15.1523307Z * [new branch] gh/mikaylagawarecki/328/head -> origin/gh/mikaylagawarecki/328/head 2025-08-14T21:20:15.1525358Z * [new branch] gh/mikaylagawarecki/328/orig -> origin/gh/mikaylagawarecki/328/orig 2025-08-14T21:20:15.1527876Z * [new branch] gh/mikaylagawarecki/329/base -> origin/gh/mikaylagawarecki/329/base 2025-08-14T21:20:15.1529651Z * [new branch] gh/mikaylagawarecki/329/head -> origin/gh/mikaylagawarecki/329/head 2025-08-14T21:20:15.1531444Z * [new branch] gh/mikaylagawarecki/329/orig -> origin/gh/mikaylagawarecki/329/orig 2025-08-14T21:20:15.1533973Z * [new branch] gh/mikaylagawarecki/330/base -> origin/gh/mikaylagawarecki/330/base 2025-08-14T21:20:15.1535853Z * [new branch] gh/mikaylagawarecki/330/head -> origin/gh/mikaylagawarecki/330/head 2025-08-14T21:20:15.1537670Z * [new branch] gh/mikaylagawarecki/330/orig -> origin/gh/mikaylagawarecki/330/orig 2025-08-14T21:20:15.1540237Z * [new branch] gh/mikaylagawarecki/331/base -> origin/gh/mikaylagawarecki/331/base 2025-08-14T21:20:15.1541965Z * [new branch] gh/mikaylagawarecki/331/head -> origin/gh/mikaylagawarecki/331/head 2025-08-14T21:20:15.1543695Z * [new branch] gh/mikaylagawarecki/331/orig -> origin/gh/mikaylagawarecki/331/orig 2025-08-14T21:20:15.1546476Z * [new branch] gh/mikaylagawarecki/332/base -> origin/gh/mikaylagawarecki/332/base 2025-08-14T21:20:15.1548463Z * [new branch] gh/mikaylagawarecki/332/head -> origin/gh/mikaylagawarecki/332/head 2025-08-14T21:20:15.1550812Z * [new branch] gh/mikaylagawarecki/332/orig -> origin/gh/mikaylagawarecki/332/orig 2025-08-14T21:20:15.1553232Z * [new branch] gh/mikaylagawarecki/333/base -> origin/gh/mikaylagawarecki/333/base 2025-08-14T21:20:15.1555055Z * [new branch] gh/mikaylagawarecki/333/head -> origin/gh/mikaylagawarecki/333/head 2025-08-14T21:20:15.1556876Z * [new branch] gh/mikaylagawarecki/333/orig -> origin/gh/mikaylagawarecki/333/orig 2025-08-14T21:20:15.1559422Z * [new branch] gh/mikaylagawarecki/334/base -> origin/gh/mikaylagawarecki/334/base 2025-08-14T21:20:15.1561097Z * [new branch] gh/mikaylagawarecki/334/head -> origin/gh/mikaylagawarecki/334/head 2025-08-14T21:20:15.1562833Z * [new branch] gh/mikaylagawarecki/334/orig -> origin/gh/mikaylagawarecki/334/orig 2025-08-14T21:20:15.1566119Z * [new branch] gh/mlazos/1/base -> origin/gh/mlazos/1/base 2025-08-14T21:20:15.1567964Z * [new branch] gh/mlazos/1/head -> origin/gh/mlazos/1/head 2025-08-14T21:20:15.1569726Z * [new branch] gh/mlazos/1/orig -> origin/gh/mlazos/1/orig 2025-08-14T21:20:15.1572778Z * [new branch] gh/mlazos/10/base -> origin/gh/mlazos/10/base 2025-08-14T21:20:15.1574641Z * [new branch] gh/mlazos/10/head -> origin/gh/mlazos/10/head 2025-08-14T21:20:15.1576556Z * [new branch] gh/mlazos/10/orig -> origin/gh/mlazos/10/orig 2025-08-14T21:20:15.1578741Z * [new branch] gh/mlazos/11/base -> origin/gh/mlazos/11/base 2025-08-14T21:20:15.1580486Z * [new branch] gh/mlazos/11/head -> origin/gh/mlazos/11/head 2025-08-14T21:20:15.1582313Z * [new branch] gh/mlazos/11/orig -> origin/gh/mlazos/11/orig 2025-08-14T21:20:15.1584774Z * [new branch] gh/mlazos/12/base -> origin/gh/mlazos/12/base 2025-08-14T21:20:15.1586515Z * [new branch] gh/mlazos/12/head -> origin/gh/mlazos/12/head 2025-08-14T21:20:15.1588681Z * [new branch] gh/mlazos/12/orig -> origin/gh/mlazos/12/orig 2025-08-14T21:20:15.1591244Z * [new branch] gh/mlazos/13/base -> origin/gh/mlazos/13/base 2025-08-14T21:20:15.1592981Z * [new branch] gh/mlazos/13/head -> origin/gh/mlazos/13/head 2025-08-14T21:20:15.1594771Z * [new branch] gh/mlazos/13/orig -> origin/gh/mlazos/13/orig 2025-08-14T21:20:15.1597107Z * [new branch] gh/mlazos/2/base -> origin/gh/mlazos/2/base 2025-08-14T21:20:15.1598944Z * [new branch] gh/mlazos/2/head -> origin/gh/mlazos/2/head 2025-08-14T21:20:15.1600635Z * [new branch] gh/mlazos/2/orig -> origin/gh/mlazos/2/orig 2025-08-14T21:20:15.1603044Z * [new branch] gh/mlazos/3/base -> origin/gh/mlazos/3/base 2025-08-14T21:20:15.1604896Z * [new branch] gh/mlazos/3/head -> origin/gh/mlazos/3/head 2025-08-14T21:20:15.1606722Z * [new branch] gh/mlazos/3/orig -> origin/gh/mlazos/3/orig 2025-08-14T21:20:15.1609807Z * [new branch] gh/mlazos/4/base -> origin/gh/mlazos/4/base 2025-08-14T21:20:15.1611592Z * [new branch] gh/mlazos/4/head -> origin/gh/mlazos/4/head 2025-08-14T21:20:15.1613584Z * [new branch] gh/mlazos/4/orig -> origin/gh/mlazos/4/orig 2025-08-14T21:20:15.1616160Z * [new branch] gh/mlazos/5/base -> origin/gh/mlazos/5/base 2025-08-14T21:20:15.1617923Z * [new branch] gh/mlazos/5/head -> origin/gh/mlazos/5/head 2025-08-14T21:20:15.1620205Z * [new branch] gh/mlazos/5/orig -> origin/gh/mlazos/5/orig 2025-08-14T21:20:15.1622706Z * [new branch] gh/mlazos/6/base -> origin/gh/mlazos/6/base 2025-08-14T21:20:15.1624503Z * [new branch] gh/mlazos/6/head -> origin/gh/mlazos/6/head 2025-08-14T21:20:15.1626251Z * [new branch] gh/mlazos/6/orig -> origin/gh/mlazos/6/orig 2025-08-14T21:20:15.1628875Z * [new branch] gh/mlazos/7/base -> origin/gh/mlazos/7/base 2025-08-14T21:20:15.1630688Z * [new branch] gh/mlazos/7/head -> origin/gh/mlazos/7/head 2025-08-14T21:20:15.1632440Z * [new branch] gh/mlazos/7/orig -> origin/gh/mlazos/7/orig 2025-08-14T21:20:15.1634856Z * [new branch] gh/mlazos/8/base -> origin/gh/mlazos/8/base 2025-08-14T21:20:15.1636664Z * [new branch] gh/mlazos/8/head -> origin/gh/mlazos/8/head 2025-08-14T21:20:15.1638487Z * [new branch] gh/mlazos/8/orig -> origin/gh/mlazos/8/orig 2025-08-14T21:20:15.1641003Z * [new branch] gh/mlazos/9/base -> origin/gh/mlazos/9/base 2025-08-14T21:20:15.1643013Z * [new branch] gh/mlazos/9/head -> origin/gh/mlazos/9/head 2025-08-14T21:20:15.1644891Z * [new branch] gh/mlazos/9/orig -> origin/gh/mlazos/9/orig 2025-08-14T21:20:15.1648452Z * [new branch] gh/mrmiywj/1/base -> origin/gh/mrmiywj/1/base 2025-08-14T21:20:15.1650306Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-08-14T21:20:15.1653330Z * [new branch] gh/muchulee8/62/base -> origin/gh/muchulee8/62/base 2025-08-14T21:20:15.1655422Z * [new branch] gh/muchulee8/62/head -> origin/gh/muchulee8/62/head 2025-08-14T21:20:15.1657107Z * [new branch] gh/muchulee8/62/orig -> origin/gh/muchulee8/62/orig 2025-08-14T21:20:15.1659537Z * [new branch] gh/muchulee8/63/base -> origin/gh/muchulee8/63/base 2025-08-14T21:20:15.1661322Z * [new branch] gh/muchulee8/63/head -> origin/gh/muchulee8/63/head 2025-08-14T21:20:15.1663113Z * [new branch] gh/muchulee8/63/orig -> origin/gh/muchulee8/63/orig 2025-08-14T21:20:15.1665765Z * [new branch] gh/muchulee8/64/base -> origin/gh/muchulee8/64/base 2025-08-14T21:20:15.1667520Z * [new branch] gh/muchulee8/64/head -> origin/gh/muchulee8/64/head 2025-08-14T21:20:15.1669362Z * [new branch] gh/muchulee8/64/orig -> origin/gh/muchulee8/64/orig 2025-08-14T21:20:15.1672093Z * [new branch] gh/muchulee8/65/base -> origin/gh/muchulee8/65/base 2025-08-14T21:20:15.1673975Z * [new branch] gh/muchulee8/65/head -> origin/gh/muchulee8/65/head 2025-08-14T21:20:15.1675887Z * [new branch] gh/muchulee8/65/orig -> origin/gh/muchulee8/65/orig 2025-08-14T21:20:15.1678847Z * [new branch] gh/oulgen/35/base -> origin/gh/oulgen/35/base 2025-08-14T21:20:15.1680627Z * [new branch] gh/oulgen/35/head -> origin/gh/oulgen/35/head 2025-08-14T21:20:15.1682419Z * [new branch] gh/oulgen/35/orig -> origin/gh/oulgen/35/orig 2025-08-14T21:20:15.1685689Z * [new branch] gh/oulgen/44/base -> origin/gh/oulgen/44/base 2025-08-14T21:20:15.1687493Z * [new branch] gh/oulgen/44/head -> origin/gh/oulgen/44/head 2025-08-14T21:20:15.1689182Z * [new branch] gh/oulgen/44/orig -> origin/gh/oulgen/44/orig 2025-08-14T21:20:15.1691793Z * [new branch] gh/oulgen/45/base -> origin/gh/oulgen/45/base 2025-08-14T21:20:15.1693516Z * [new branch] gh/oulgen/45/head -> origin/gh/oulgen/45/head 2025-08-14T21:20:15.1695337Z * [new branch] gh/oulgen/45/orig -> origin/gh/oulgen/45/orig 2025-08-14T21:20:15.1697868Z * [new branch] gh/oulgen/46/base -> origin/gh/oulgen/46/base 2025-08-14T21:20:15.1699627Z * [new branch] gh/oulgen/46/head -> origin/gh/oulgen/46/head 2025-08-14T21:20:15.1701428Z * [new branch] gh/oulgen/46/orig -> origin/gh/oulgen/46/orig 2025-08-14T21:20:15.1703845Z * [new branch] gh/oulgen/47/base -> origin/gh/oulgen/47/base 2025-08-14T21:20:15.1705640Z * [new branch] gh/oulgen/47/head -> origin/gh/oulgen/47/head 2025-08-14T21:20:15.1707384Z * [new branch] gh/oulgen/47/orig -> origin/gh/oulgen/47/orig 2025-08-14T21:20:15.1710696Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-08-14T21:20:15.1712497Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 2025-08-14T21:20:15.1714375Z * [new branch] gh/pearu/108/orig -> origin/gh/pearu/108/orig 2025-08-14T21:20:15.1717185Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-08-14T21:20:15.1719120Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-08-14T21:20:15.1721036Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-08-14T21:20:15.1723672Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-08-14T21:20:15.1725713Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-08-14T21:20:15.1727551Z * [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-08-14T21:20:15.1730652Z * [new branch] gh/qqaatw/29/base -> origin/gh/qqaatw/29/base 2025-08-14T21:20:15.1732308Z * [new branch] gh/qqaatw/29/head -> origin/gh/qqaatw/29/head 2025-08-14T21:20:15.1734065Z * [new branch] gh/qqaatw/29/orig -> origin/gh/qqaatw/29/orig 2025-08-14T21:20:15.1737108Z * [new branch] gh/raymo/cleanup-dynamo-logging -> origin/gh/raymo/cleanup-dynamo-logging 2025-08-14T21:20:15.1738835Z * [new branch] gh/raymo/refresh-script -> origin/gh/raymo/refresh-script 2025-08-14T21:20:15.1741781Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-08-14T21:20:15.1743648Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-08-14T21:20:15.1746059Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-08-14T21:20:15.1748587Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-08-14T21:20:15.1751632Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-08-14T21:20:15.1754060Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-08-14T21:20:15.1755822Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 2025-08-14T21:20:15.1757576Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-08-14T21:20:15.1760005Z * [new branch] gh/rec/156/base -> origin/gh/rec/156/base 2025-08-14T21:20:15.1761776Z * [new branch] gh/rec/156/head -> origin/gh/rec/156/head 2025-08-14T21:20:15.1763544Z * [new branch] gh/rec/156/orig -> origin/gh/rec/156/orig 2025-08-14T21:20:15.1766151Z * [new branch] gh/rec/158/base -> origin/gh/rec/158/base 2025-08-14T21:20:15.1767920Z * [new branch] gh/rec/158/head -> origin/gh/rec/158/head 2025-08-14T21:20:15.1769845Z * [new branch] gh/rec/158/orig -> origin/gh/rec/158/orig 2025-08-14T21:20:15.1772263Z * [new branch] gh/rec/159/base -> origin/gh/rec/159/base 2025-08-14T21:20:15.1773991Z * [new branch] gh/rec/159/head -> origin/gh/rec/159/head 2025-08-14T21:20:15.1776442Z * [new branch] gh/rec/160/base -> origin/gh/rec/160/base 2025-08-14T21:20:15.1778144Z * [new branch] gh/rec/160/head -> origin/gh/rec/160/head 2025-08-14T21:20:15.1779960Z * [new branch] gh/rec/160/orig -> origin/gh/rec/160/orig 2025-08-14T21:20:15.1782443Z * [new branch] gh/rec/161/base -> origin/gh/rec/161/base 2025-08-14T21:20:15.1784260Z * [new branch] gh/rec/161/head -> origin/gh/rec/161/head 2025-08-14T21:20:15.1786001Z * [new branch] gh/rec/161/orig -> origin/gh/rec/161/orig 2025-08-14T21:20:15.1788506Z * [new branch] gh/rec/162/base -> origin/gh/rec/162/base 2025-08-14T21:20:15.1790345Z * [new branch] gh/rec/162/head -> origin/gh/rec/162/head 2025-08-14T21:20:15.1792130Z * [new branch] gh/rec/162/orig -> origin/gh/rec/162/orig 2025-08-14T21:20:15.1794615Z * [new branch] gh/rec/163/base -> origin/gh/rec/163/base 2025-08-14T21:20:15.1796363Z * [new branch] gh/rec/163/head -> origin/gh/rec/163/head 2025-08-14T21:20:15.1798116Z * [new branch] gh/rec/163/orig -> origin/gh/rec/163/orig 2025-08-14T21:20:15.1800547Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-08-14T21:20:15.1802299Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-08-14T21:20:15.1804057Z * [new branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-08-14T21:20:15.1807458Z * [new branch] gh/robert-hardwick/1/base -> origin/gh/robert-hardwick/1/base 2025-08-14T21:20:15.1809108Z * [new branch] gh/robert-hardwick/1/head -> origin/gh/robert-hardwick/1/head 2025-08-14T21:20:15.1810933Z * [new branch] gh/robert-hardwick/1/orig -> origin/gh/robert-hardwick/1/orig 2025-08-14T21:20:15.1813378Z * [new branch] gh/robert-hardwick/2/base -> origin/gh/robert-hardwick/2/base 2025-08-14T21:20:15.1815128Z * [new branch] gh/robert-hardwick/2/head -> origin/gh/robert-hardwick/2/head 2025-08-14T21:20:15.1817049Z * [new branch] gh/robert-hardwick/2/orig -> origin/gh/robert-hardwick/2/orig 2025-08-14T21:20:15.1819409Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-08-14T21:20:15.1821156Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-08-14T21:20:15.1822907Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-08-14T21:20:15.1825347Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-08-14T21:20:15.1827165Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-08-14T21:20:15.1828983Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-08-14T21:20:15.1831913Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-08-14T21:20:15.1833695Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-08-14T21:20:15.1836172Z * [new branch] gh/rtimpe/10/base -> origin/gh/rtimpe/10/base 2025-08-14T21:20:15.1837918Z * [new branch] gh/rtimpe/10/head -> origin/gh/rtimpe/10/head 2025-08-14T21:20:15.1839768Z * [new branch] gh/rtimpe/10/orig -> origin/gh/rtimpe/10/orig 2025-08-14T21:20:15.1842200Z * [new branch] gh/rtimpe/11/base -> origin/gh/rtimpe/11/base 2025-08-14T21:20:15.1843970Z * [new branch] gh/rtimpe/11/head -> origin/gh/rtimpe/11/head 2025-08-14T21:20:15.1845895Z * [new branch] gh/rtimpe/11/orig -> origin/gh/rtimpe/11/orig 2025-08-14T21:20:15.1849211Z * [new branch] gh/rtimpe/12/base -> origin/gh/rtimpe/12/base 2025-08-14T21:20:15.1851051Z * [new branch] gh/rtimpe/12/head -> origin/gh/rtimpe/12/head 2025-08-14T21:20:15.1852861Z * [new branch] gh/rtimpe/12/orig -> origin/gh/rtimpe/12/orig 2025-08-14T21:20:15.1855263Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-08-14T21:20:15.1856988Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-08-14T21:20:15.1859405Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-08-14T21:20:15.1861103Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-08-14T21:20:15.1863612Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-08-14T21:20:15.1865346Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-08-14T21:20:15.1867810Z * [new branch] gh/rtimpe/5/base -> origin/gh/rtimpe/5/base 2025-08-14T21:20:15.1869549Z * [new branch] gh/rtimpe/5/head -> origin/gh/rtimpe/5/head 2025-08-14T21:20:15.1871356Z * [new branch] gh/rtimpe/5/orig -> origin/gh/rtimpe/5/orig 2025-08-14T21:20:15.1873877Z * [new branch] gh/rtimpe/6/base -> origin/gh/rtimpe/6/base 2025-08-14T21:20:15.1875607Z * [new branch] gh/rtimpe/6/head -> origin/gh/rtimpe/6/head 2025-08-14T21:20:15.1877349Z * [new branch] gh/rtimpe/6/orig -> origin/gh/rtimpe/6/orig 2025-08-14T21:20:15.1879736Z * [new branch] gh/rtimpe/7/base -> origin/gh/rtimpe/7/base 2025-08-14T21:20:15.1881659Z * [new branch] gh/rtimpe/7/head -> origin/gh/rtimpe/7/head 2025-08-14T21:20:15.1883284Z * [new branch] gh/rtimpe/7/orig -> origin/gh/rtimpe/7/orig 2025-08-14T21:20:15.1885924Z * [new branch] gh/rtimpe/8/base -> origin/gh/rtimpe/8/base 2025-08-14T21:20:15.1887789Z * [new branch] gh/rtimpe/8/head -> origin/gh/rtimpe/8/head 2025-08-14T21:20:15.1889502Z * [new branch] gh/rtimpe/8/orig -> origin/gh/rtimpe/8/orig 2025-08-14T21:20:15.1892301Z * [new branch] gh/rtimpe/9/base -> origin/gh/rtimpe/9/base 2025-08-14T21:20:15.1894047Z * [new branch] gh/rtimpe/9/head -> origin/gh/rtimpe/9/head 2025-08-14T21:20:15.1895851Z * [new branch] gh/rtimpe/9/orig -> origin/gh/rtimpe/9/orig 2025-08-14T21:20:15.1899147Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-08-14T21:20:15.1901068Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-08-14T21:20:15.1902778Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-08-14T21:20:15.1905276Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-08-14T21:20:15.1907301Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-08-14T21:20:15.1908994Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-08-14T21:20:15.1911525Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 2025-08-14T21:20:15.1913749Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-08-14T21:20:15.1915567Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-08-14T21:20:15.1918014Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-08-14T21:20:15.1919873Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 2025-08-14T21:20:15.1921636Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-08-14T21:20:15.1924389Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-08-14T21:20:15.1927095Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-08-14T21:20:15.1928901Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-08-14T21:20:15.1931388Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-08-14T21:20:15.1933171Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-08-14T21:20:15.1934929Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-08-14T21:20:15.1937986Z * [new branch] gh/sarckk/2/base -> origin/gh/sarckk/2/base 2025-08-14T21:20:15.1939693Z * [new branch] gh/sarckk/2/head -> origin/gh/sarckk/2/head 2025-08-14T21:20:15.1941496Z * [new branch] gh/sarckk/2/orig -> origin/gh/sarckk/2/orig 2025-08-14T21:20:15.1944569Z * [new branch] gh/seemethere/23/head -> origin/gh/seemethere/23/head 2025-08-14T21:20:15.1947739Z * [new branch] gh/seemethere/24/base -> origin/gh/seemethere/24/base 2025-08-14T21:20:15.1949575Z * [new branch] gh/seemethere/24/head -> origin/gh/seemethere/24/head 2025-08-14T21:20:15.1951504Z * [new branch] gh/seemethere/24/orig -> origin/gh/seemethere/24/orig 2025-08-14T21:20:15.1953884Z * [new branch] gh/seemethere/30/base -> origin/gh/seemethere/30/base 2025-08-14T21:20:15.1955729Z * [new branch] gh/seemethere/30/head -> origin/gh/seemethere/30/head 2025-08-14T21:20:15.1957685Z * [new branch] gh/seemethere/30/orig -> origin/gh/seemethere/30/orig 2025-08-14T21:20:15.1960104Z * [new branch] gh/seemethere/32/base -> origin/gh/seemethere/32/base 2025-08-14T21:20:15.1961846Z * [new branch] gh/seemethere/32/head -> origin/gh/seemethere/32/head 2025-08-14T21:20:15.1963647Z * [new branch] gh/seemethere/32/orig -> origin/gh/seemethere/32/orig 2025-08-14T21:20:15.1966363Z * [new branch] gh/seemethere/33/base -> origin/gh/seemethere/33/base 2025-08-14T21:20:15.1968102Z * [new branch] gh/seemethere/33/head -> origin/gh/seemethere/33/head 2025-08-14T21:20:15.1969905Z * [new branch] gh/seemethere/33/orig -> origin/gh/seemethere/33/orig 2025-08-14T21:20:15.1972406Z * [new branch] gh/seemethere/34/base -> origin/gh/seemethere/34/base 2025-08-14T21:20:15.1974112Z * [new branch] gh/seemethere/34/head -> origin/gh/seemethere/34/head 2025-08-14T21:20:15.1975947Z * [new branch] gh/seemethere/34/orig -> origin/gh/seemethere/34/orig 2025-08-14T21:20:15.1978354Z * [new branch] gh/seemethere/35/base -> origin/gh/seemethere/35/base 2025-08-14T21:20:15.1980604Z * [new branch] gh/seemethere/35/head -> origin/gh/seemethere/35/head 2025-08-14T21:20:15.1982328Z * [new branch] gh/seemethere/35/orig -> origin/gh/seemethere/35/orig 2025-08-14T21:20:15.1985000Z * [new branch] gh/seemethere/37/base -> origin/gh/seemethere/37/base 2025-08-14T21:20:15.1986813Z * [new branch] gh/seemethere/37/head -> origin/gh/seemethere/37/head 2025-08-14T21:20:15.1988573Z * [new branch] gh/seemethere/37/orig -> origin/gh/seemethere/37/orig 2025-08-14T21:20:15.1991057Z * [new branch] gh/seemethere/39/base -> origin/gh/seemethere/39/base 2025-08-14T21:20:15.1992822Z * [new branch] gh/seemethere/39/head -> origin/gh/seemethere/39/head 2025-08-14T21:20:15.1994606Z * [new branch] gh/seemethere/39/orig -> origin/gh/seemethere/39/orig 2025-08-14T21:20:15.1997055Z * [new branch] gh/seemethere/40/base -> origin/gh/seemethere/40/base 2025-08-14T21:20:15.1998747Z * [new branch] gh/seemethere/40/head -> origin/gh/seemethere/40/head 2025-08-14T21:20:15.2000597Z * [new branch] gh/seemethere/40/orig -> origin/gh/seemethere/40/orig 2025-08-14T21:20:15.2003024Z * [new branch] gh/seemethere/41/base -> origin/gh/seemethere/41/base 2025-08-14T21:20:15.2004921Z * [new branch] gh/seemethere/41/head -> origin/gh/seemethere/41/head 2025-08-14T21:20:15.2006874Z * [new branch] gh/seemethere/41/orig -> origin/gh/seemethere/41/orig 2025-08-14T21:20:15.2009251Z * [new branch] gh/seemethere/42/base -> origin/gh/seemethere/42/base 2025-08-14T21:20:15.2011232Z * [new branch] gh/seemethere/42/head -> origin/gh/seemethere/42/head 2025-08-14T21:20:15.2013023Z * [new branch] gh/seemethere/42/orig -> origin/gh/seemethere/42/orig 2025-08-14T21:20:15.2015590Z * [new branch] gh/seemethere/43/base -> origin/gh/seemethere/43/base 2025-08-14T21:20:15.2017387Z * [new branch] gh/seemethere/43/head -> origin/gh/seemethere/43/head 2025-08-14T21:20:15.2019120Z * [new branch] gh/seemethere/43/orig -> origin/gh/seemethere/43/orig 2025-08-14T21:20:15.2021525Z * [new branch] gh/seemethere/44/base -> origin/gh/seemethere/44/base 2025-08-14T21:20:15.2023315Z * [new branch] gh/seemethere/44/head -> origin/gh/seemethere/44/head 2025-08-14T21:20:15.2025138Z * [new branch] gh/seemethere/44/orig -> origin/gh/seemethere/44/orig 2025-08-14T21:20:15.2027684Z * [new branch] gh/seemethere/45/base -> origin/gh/seemethere/45/base 2025-08-14T21:20:15.2029255Z * [new branch] gh/seemethere/45/head -> origin/gh/seemethere/45/head 2025-08-14T21:20:15.2031193Z * [new branch] gh/seemethere/45/orig -> origin/gh/seemethere/45/orig 2025-08-14T21:20:15.2033761Z * [new branch] gh/seemethere/46/base -> origin/gh/seemethere/46/base 2025-08-14T21:20:15.2035532Z * [new branch] gh/seemethere/46/head -> origin/gh/seemethere/46/head 2025-08-14T21:20:15.2037237Z * [new branch] gh/seemethere/46/orig -> origin/gh/seemethere/46/orig 2025-08-14T21:20:15.2039716Z * [new branch] gh/seemethere/47/base -> origin/gh/seemethere/47/base 2025-08-14T21:20:15.2041513Z * [new branch] gh/seemethere/47/head -> origin/gh/seemethere/47/head 2025-08-14T21:20:15.2043328Z * [new branch] gh/seemethere/47/orig -> origin/gh/seemethere/47/orig 2025-08-14T21:20:15.2045930Z * [new branch] gh/seemethere/48/base -> origin/gh/seemethere/48/base 2025-08-14T21:20:15.2047806Z * [new branch] gh/seemethere/48/head -> origin/gh/seemethere/48/head 2025-08-14T21:20:15.2051441Z * [new branch] gh/seemethere/48/orig -> origin/gh/seemethere/48/orig 2025-08-14T21:20:15.2054355Z * [new branch] gh/seemethere/49/base -> origin/gh/seemethere/49/base 2025-08-14T21:20:15.2056145Z * [new branch] gh/seemethere/49/head -> origin/gh/seemethere/49/head 2025-08-14T21:20:15.2057936Z * [new branch] gh/seemethere/49/orig -> origin/gh/seemethere/49/orig 2025-08-14T21:20:15.2060486Z * [new branch] gh/seemethere/50/base -> origin/gh/seemethere/50/base 2025-08-14T21:20:15.2062317Z * [new branch] gh/seemethere/50/head -> origin/gh/seemethere/50/head 2025-08-14T21:20:15.2064132Z * [new branch] gh/seemethere/50/orig -> origin/gh/seemethere/50/orig 2025-08-14T21:20:15.2066666Z * [new branch] gh/seemethere/51/base -> origin/gh/seemethere/51/base 2025-08-14T21:20:15.2068387Z * [new branch] gh/seemethere/51/head -> origin/gh/seemethere/51/head 2025-08-14T21:20:15.2070051Z * [new branch] gh/seemethere/51/orig -> origin/gh/seemethere/51/orig 2025-08-14T21:20:15.2072585Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-08-14T21:20:15.2074435Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-08-14T21:20:15.2076269Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-08-14T21:20:15.2078760Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-08-14T21:20:15.2080534Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-08-14T21:20:15.2082293Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-08-14T21:20:15.2085065Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-08-14T21:20:15.2086935Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-08-14T21:20:15.2088746Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-08-14T21:20:15.2091123Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-08-14T21:20:15.2092965Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-08-14T21:20:15.2094735Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-08-14T21:20:15.2097291Z * [new branch] gh/seemethere/56/base -> origin/gh/seemethere/56/base 2025-08-14T21:20:15.2099065Z * [new branch] gh/seemethere/56/head -> origin/gh/seemethere/56/head 2025-08-14T21:20:15.2100972Z * [new branch] gh/seemethere/56/orig -> origin/gh/seemethere/56/orig 2025-08-14T21:20:15.2103315Z * [new branch] gh/seemethere/57/base -> origin/gh/seemethere/57/base 2025-08-14T21:20:15.2105129Z * [new branch] gh/seemethere/57/head -> origin/gh/seemethere/57/head 2025-08-14T21:20:15.2106846Z * [new branch] gh/seemethere/57/orig -> origin/gh/seemethere/57/orig 2025-08-14T21:20:15.2109857Z * [new branch] gh/seemethere/58/base -> origin/gh/seemethere/58/base 2025-08-14T21:20:15.2111671Z * [new branch] gh/seemethere/58/head -> origin/gh/seemethere/58/head 2025-08-14T21:20:15.2113482Z * [new branch] gh/seemethere/58/orig -> origin/gh/seemethere/58/orig 2025-08-14T21:20:15.2115991Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-08-14T21:20:15.2117737Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-08-14T21:20:15.2119527Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-08-14T21:20:15.2122094Z * [new branch] gh/seemethere/7/head -> origin/gh/seemethere/7/head 2025-08-14T21:20:15.2125437Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-08-14T21:20:15.2127342Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-08-14T21:20:15.2129244Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-08-14T21:20:15.2131951Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-08-14T21:20:15.2133907Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-08-14T21:20:15.2135711Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-08-14T21:20:15.2138148Z * [new branch] gh/shunting314/211/base -> origin/gh/shunting314/211/base 2025-08-14T21:20:15.2139978Z * [new branch] gh/shunting314/211/head -> origin/gh/shunting314/211/head 2025-08-14T21:20:15.2141791Z * [new branch] gh/shunting314/211/orig -> origin/gh/shunting314/211/orig 2025-08-14T21:20:15.2144075Z * [new branch] gh/shunting314/212/base -> origin/gh/shunting314/212/base 2025-08-14T21:20:15.2145842Z * [new branch] gh/shunting314/212/head -> origin/gh/shunting314/212/head 2025-08-14T21:20:15.2147762Z * [new branch] gh/shunting314/212/orig -> origin/gh/shunting314/212/orig 2025-08-14T21:20:15.2150789Z * [new branch] gh/shunting314/213/base -> origin/gh/shunting314/213/base 2025-08-14T21:20:15.2152592Z * [new branch] gh/shunting314/213/head -> origin/gh/shunting314/213/head 2025-08-14T21:20:15.2154435Z * [new branch] gh/shunting314/213/orig -> origin/gh/shunting314/213/orig 2025-08-14T21:20:15.2158042Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-08-14T21:20:15.2159759Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-08-14T21:20:15.2162078Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 2025-08-14T21:20:15.2164170Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-08-14T21:20:15.2166750Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-08-14T21:20:15.2168514Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-08-14T21:20:15.2170836Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-08-14T21:20:15.2172585Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-08-14T21:20:15.2176252Z * [new branch] gh/sinhaanhsul/1/base -> origin/gh/sinhaanhsul/1/base 2025-08-14T21:20:15.2178272Z * [new branch] gh/sinhaanhsul/1/head -> origin/gh/sinhaanhsul/1/head 2025-08-14T21:20:15.2181211Z * [new branch] gh/skarjala/11/base -> origin/gh/skarjala/11/base 2025-08-14T21:20:15.2183054Z * [new branch] gh/skarjala/11/head -> origin/gh/skarjala/11/head 2025-08-14T21:20:15.2184721Z * [new branch] gh/skarjala/11/orig -> origin/gh/skarjala/11/orig 2025-08-14T21:20:15.2187224Z * [new branch] gh/skarjala/13/base -> origin/gh/skarjala/13/base 2025-08-14T21:20:15.2188975Z * [new branch] gh/skarjala/13/head -> origin/gh/skarjala/13/head 2025-08-14T21:20:15.2190792Z * [new branch] gh/skarjala/13/orig -> origin/gh/skarjala/13/orig 2025-08-14T21:20:15.2193190Z * [new branch] gh/skarjala/14/base -> origin/gh/skarjala/14/base 2025-08-14T21:20:15.2194975Z * [new branch] gh/skarjala/14/head -> origin/gh/skarjala/14/head 2025-08-14T21:20:15.2196746Z * [new branch] gh/skarjala/14/orig -> origin/gh/skarjala/14/orig 2025-08-14T21:20:15.2199141Z * [new branch] gh/skarjala/15/base -> origin/gh/skarjala/15/base 2025-08-14T21:20:15.2200908Z * [new branch] gh/skarjala/15/head -> origin/gh/skarjala/15/head 2025-08-14T21:20:15.2202711Z * [new branch] gh/skarjala/15/orig -> origin/gh/skarjala/15/orig 2025-08-14T21:20:15.2205425Z * [new branch] gh/skarjala/16/base -> origin/gh/skarjala/16/base 2025-08-14T21:20:15.2207279Z * [new branch] gh/skarjala/16/head -> origin/gh/skarjala/16/head 2025-08-14T21:20:15.2208982Z * [new branch] gh/skarjala/16/orig -> origin/gh/skarjala/16/orig 2025-08-14T21:20:15.2211500Z * [new branch] gh/skarjala/17/base -> origin/gh/skarjala/17/base 2025-08-14T21:20:15.2213464Z * [new branch] gh/skarjala/17/head -> origin/gh/skarjala/17/head 2025-08-14T21:20:15.2215201Z * [new branch] gh/skarjala/17/orig -> origin/gh/skarjala/17/orig 2025-08-14T21:20:15.2217673Z * [new branch] gh/skarjala/18/base -> origin/gh/skarjala/18/base 2025-08-14T21:20:15.2219453Z * [new branch] gh/skarjala/18/head -> origin/gh/skarjala/18/head 2025-08-14T21:20:15.2221161Z * [new branch] gh/skarjala/18/orig -> origin/gh/skarjala/18/orig 2025-08-14T21:20:15.2223615Z * [new branch] gh/skarjala/19/base -> origin/gh/skarjala/19/base 2025-08-14T21:20:15.2225436Z * [new branch] gh/skarjala/19/head -> origin/gh/skarjala/19/head 2025-08-14T21:20:15.2227202Z * [new branch] gh/skarjala/19/orig -> origin/gh/skarjala/19/orig 2025-08-14T21:20:15.2230352Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-08-14T21:20:15.2232214Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-08-14T21:20:15.2234024Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-08-14T21:20:15.2236586Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-08-14T21:20:15.2238489Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-08-14T21:20:15.2240328Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-08-14T21:20:15.2243088Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-08-14T21:20:15.2245102Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-08-14T21:20:15.2246867Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-08-14T21:20:15.2252437Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 2025-08-14T21:20:15.2254396Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-08-14T21:20:15.2256146Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-08-14T21:20:15.2258640Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-08-14T21:20:15.2260425Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-08-14T21:20:15.2262178Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-08-14T21:20:15.2264613Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-08-14T21:20:15.2266548Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-08-14T21:20:15.2268278Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-08-14T21:20:15.2270934Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-08-14T21:20:15.2272800Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-08-14T21:20:15.2274602Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-08-14T21:20:15.2277039Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-08-14T21:20:15.2278812Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-08-14T21:20:15.2280665Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-08-14T21:20:15.2283244Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-08-14T21:20:15.2285096Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-08-14T21:20:15.2286877Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-08-14T21:20:15.2289452Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-08-14T21:20:15.2291147Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-08-14T21:20:15.2292867Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-08-14T21:20:15.2295514Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-08-14T21:20:15.2297244Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-08-14T21:20:15.2298847Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-08-14T21:20:15.2301479Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-08-14T21:20:15.2303199Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-08-14T21:20:15.2305052Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-08-14T21:20:15.2307764Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-08-14T21:20:15.2309537Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-08-14T21:20:15.2311390Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-08-14T21:20:15.2313756Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-08-14T21:20:15.2315482Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-08-14T21:20:15.2317186Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-08-14T21:20:15.2319714Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-08-14T21:20:15.2321505Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-08-14T21:20:15.2323462Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-08-14T21:20:15.2326098Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 2025-08-14T21:20:15.2327943Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-08-14T21:20:15.2329878Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-08-14T21:20:15.2333087Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-08-14T21:20:15.2334966Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-08-14T21:20:15.2336731Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-08-14T21:20:15.2339661Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-08-14T21:20:15.2341472Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-08-14T21:20:15.2343332Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-08-14T21:20:15.2345884Z * [new branch] gh/soulitzer/362/base -> origin/gh/soulitzer/362/base 2025-08-14T21:20:15.2347840Z * [new branch] gh/soulitzer/362/head -> origin/gh/soulitzer/362/head 2025-08-14T21:20:15.2349974Z * [new branch] gh/soulitzer/362/orig -> origin/gh/soulitzer/362/orig 2025-08-14T21:20:15.2353189Z * [new branch] gh/soulitzer/372/base -> origin/gh/soulitzer/372/base 2025-08-14T21:20:15.2355235Z * [new branch] gh/soulitzer/372/head -> origin/gh/soulitzer/372/head 2025-08-14T21:20:15.2357686Z * [new branch] gh/soulitzer/372/orig -> origin/gh/soulitzer/372/orig 2025-08-14T21:20:15.2361164Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-08-14T21:20:15.2364037Z * [new branch] gh/swolchok/758/base -> origin/gh/swolchok/758/base 2025-08-14T21:20:15.2366316Z * [new branch] gh/swolchok/758/head -> origin/gh/swolchok/758/head 2025-08-14T21:20:15.2368600Z * [new branch] gh/swolchok/758/orig -> origin/gh/swolchok/758/orig 2025-08-14T21:20:15.2371865Z * [new branch] gh/swolchok/767/base -> origin/gh/swolchok/767/base 2025-08-14T21:20:15.2374271Z * [new branch] gh/swolchok/767/head -> origin/gh/swolchok/767/head 2025-08-14T21:20:15.2376699Z * [new branch] gh/swolchok/767/orig -> origin/gh/swolchok/767/orig 2025-08-14T21:20:15.2379851Z * [new branch] gh/swolchok/768/base -> origin/gh/swolchok/768/base 2025-08-14T21:20:15.2382068Z * [new branch] gh/swolchok/768/head -> origin/gh/swolchok/768/head 2025-08-14T21:20:15.2384258Z * [new branch] gh/swolchok/768/orig -> origin/gh/swolchok/768/orig 2025-08-14T21:20:15.2388075Z * [new branch] gh/swolchok/769/base -> origin/gh/swolchok/769/base 2025-08-14T21:20:15.2390291Z * [new branch] gh/swolchok/769/head -> origin/gh/swolchok/769/head 2025-08-14T21:20:15.2392807Z * [new branch] gh/swolchok/769/orig -> origin/gh/swolchok/769/orig 2025-08-14T21:20:15.2395800Z * [new branch] gh/swolchok/771/base -> origin/gh/swolchok/771/base 2025-08-14T21:20:15.2398071Z * [new branch] gh/swolchok/771/head -> origin/gh/swolchok/771/head 2025-08-14T21:20:15.2400314Z * [new branch] gh/swolchok/771/orig -> origin/gh/swolchok/771/orig 2025-08-14T21:20:15.2403340Z * [new branch] gh/swolchok/772/base -> origin/gh/swolchok/772/base 2025-08-14T21:20:15.2405706Z * [new branch] gh/swolchok/772/head -> origin/gh/swolchok/772/head 2025-08-14T21:20:15.2407921Z * [new branch] gh/swolchok/772/orig -> origin/gh/swolchok/772/orig 2025-08-14T21:20:15.2411223Z * [new branch] gh/swolchok/773/base -> origin/gh/swolchok/773/base 2025-08-14T21:20:15.2413801Z * [new branch] gh/swolchok/773/head -> origin/gh/swolchok/773/head 2025-08-14T21:20:15.2416117Z * [new branch] gh/swolchok/773/orig -> origin/gh/swolchok/773/orig 2025-08-14T21:20:15.2419230Z * [new branch] gh/swolchok/786/base -> origin/gh/swolchok/786/base 2025-08-14T21:20:15.2421336Z * [new branch] gh/swolchok/786/head -> origin/gh/swolchok/786/head 2025-08-14T21:20:15.2423657Z * [new branch] gh/swolchok/786/orig -> origin/gh/swolchok/786/orig 2025-08-14T21:20:15.2426569Z * [new branch] gh/swolchok/787/base -> origin/gh/swolchok/787/base 2025-08-14T21:20:15.2428687Z * [new branch] gh/swolchok/787/head -> origin/gh/swolchok/787/head 2025-08-14T21:20:15.2431016Z * [new branch] gh/swolchok/787/orig -> origin/gh/swolchok/787/orig 2025-08-14T21:20:15.2434562Z * [new branch] gh/syed-ahmed/2/base -> origin/gh/syed-ahmed/2/base 2025-08-14T21:20:15.2436778Z * [new branch] gh/syed-ahmed/2/head -> origin/gh/syed-ahmed/2/head 2025-08-14T21:20:15.2439001Z * [new branch] gh/syed-ahmed/2/orig -> origin/gh/syed-ahmed/2/orig 2025-08-14T21:20:15.2442020Z * [new branch] gh/syed-ahmed/3/base -> origin/gh/syed-ahmed/3/base 2025-08-14T21:20:15.2444211Z * [new branch] gh/syed-ahmed/3/head -> origin/gh/syed-ahmed/3/head 2025-08-14T21:20:15.2446739Z * [new branch] gh/syed-ahmed/3/orig -> origin/gh/syed-ahmed/3/orig 2025-08-14T21:20:15.2449978Z * [new branch] gh/syed-ahmed/4/base -> origin/gh/syed-ahmed/4/base 2025-08-14T21:20:15.2451986Z * [new branch] gh/syed-ahmed/4/head -> origin/gh/syed-ahmed/4/head 2025-08-14T21:20:15.2454333Z * [new branch] gh/syed-ahmed/4/orig -> origin/gh/syed-ahmed/4/orig 2025-08-14T21:20:15.2457953Z * [new branch] gh/teja-rao/3/base -> origin/gh/teja-rao/3/base 2025-08-14T21:20:15.2460159Z * [new branch] gh/teja-rao/3/head -> origin/gh/teja-rao/3/head 2025-08-14T21:20:15.2462446Z * [new branch] gh/teja-rao/3/orig -> origin/gh/teja-rao/3/orig 2025-08-14T21:20:15.2466479Z * [new branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-08-14T21:20:15.2468719Z * [new branch] gh/tianyu-l/2/head -> origin/gh/tianyu-l/2/head 2025-08-14T21:20:15.2470906Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-08-14T21:20:15.2474656Z * [new branch] gh/titaiwangms/1/base -> origin/gh/titaiwangms/1/base 2025-08-14T21:20:15.2476807Z * [new branch] gh/titaiwangms/1/head -> origin/gh/titaiwangms/1/head 2025-08-14T21:20:15.2478981Z * [new branch] gh/titaiwangms/1/orig -> origin/gh/titaiwangms/1/orig 2025-08-14T21:20:15.2481949Z * [new branch] gh/titaiwangms/2/base -> origin/gh/titaiwangms/2/base 2025-08-14T21:20:15.2484147Z * [new branch] gh/titaiwangms/2/head -> origin/gh/titaiwangms/2/head 2025-08-14T21:20:15.2486556Z * [new branch] gh/titaiwangms/2/orig -> origin/gh/titaiwangms/2/orig 2025-08-14T21:20:15.2489533Z * [new branch] gh/titaiwangms/3/base -> origin/gh/titaiwangms/3/base 2025-08-14T21:20:15.2491728Z * [new branch] gh/titaiwangms/3/head -> origin/gh/titaiwangms/3/head 2025-08-14T21:20:15.2493895Z * [new branch] gh/titaiwangms/3/orig -> origin/gh/titaiwangms/3/orig 2025-08-14T21:20:15.2496962Z * [new branch] gh/titaiwangms/4/base -> origin/gh/titaiwangms/4/base 2025-08-14T21:20:15.2499147Z * [new branch] gh/titaiwangms/4/head -> origin/gh/titaiwangms/4/head 2025-08-14T21:20:15.2501528Z * [new branch] gh/titaiwangms/4/orig -> origin/gh/titaiwangms/4/orig 2025-08-14T21:20:15.2504473Z * [new branch] gh/titaiwangms/5/base -> origin/gh/titaiwangms/5/base 2025-08-14T21:20:15.2506601Z * [new branch] gh/titaiwangms/5/head -> origin/gh/titaiwangms/5/head 2025-08-14T21:20:15.2508841Z * [new branch] gh/titaiwangms/5/orig -> origin/gh/titaiwangms/5/orig 2025-08-14T21:20:15.2511841Z * [new branch] gh/titaiwangms/6/base -> origin/gh/titaiwangms/6/base 2025-08-14T21:20:15.2513978Z * [new branch] gh/titaiwangms/6/head -> origin/gh/titaiwangms/6/head 2025-08-14T21:20:15.2516199Z * [new branch] gh/titaiwangms/6/orig -> origin/gh/titaiwangms/6/orig 2025-08-14T21:20:15.2519293Z * [new branch] gh/titaiwangms/7/base -> origin/gh/titaiwangms/7/base 2025-08-14T21:20:15.2522039Z * [new branch] gh/titaiwangms/7/head -> origin/gh/titaiwangms/7/head 2025-08-14T21:20:15.2524163Z * [new branch] gh/titaiwangms/7/orig -> origin/gh/titaiwangms/7/orig 2025-08-14T21:20:15.2527370Z * [new branch] gh/titaiwangms/8/base -> origin/gh/titaiwangms/8/base 2025-08-14T21:20:15.2529421Z * [new branch] gh/titaiwangms/8/head -> origin/gh/titaiwangms/8/head 2025-08-14T21:20:15.2531636Z * [new branch] gh/titaiwangms/8/orig -> origin/gh/titaiwangms/8/orig 2025-08-14T21:20:15.2535401Z * [new branch] gh/tugsbayasgalan/1/base -> origin/gh/tugsbayasgalan/1/base 2025-08-14T21:20:15.2537566Z * [new branch] gh/tugsbayasgalan/1/head -> origin/gh/tugsbayasgalan/1/head 2025-08-14T21:20:15.2539893Z * [new branch] gh/tugsbayasgalan/1/orig -> origin/gh/tugsbayasgalan/1/orig 2025-08-14T21:20:15.2543448Z * [new branch] gh/v0i0/1/base -> origin/gh/v0i0/1/base 2025-08-14T21:20:15.2545582Z * [new branch] gh/v0i0/1/head -> origin/gh/v0i0/1/head 2025-08-14T21:20:15.2547823Z * [new branch] gh/v0i0/1/orig -> origin/gh/v0i0/1/orig 2025-08-14T21:20:15.2561398Z * [new branch] gh/v0i0/2/base -> origin/gh/v0i0/2/base 2025-08-14T21:20:15.2561852Z * [new branch] gh/v0i0/2/head -> origin/gh/v0i0/2/head 2025-08-14T21:20:15.2562057Z * [new branch] gh/v0i0/2/orig -> origin/gh/v0i0/2/orig 2025-08-14T21:20:15.2562243Z * [new branch] gh/v0i0/3/base -> origin/gh/v0i0/3/base 2025-08-14T21:20:15.2562420Z * [new branch] gh/v0i0/3/head -> origin/gh/v0i0/3/head 2025-08-14T21:20:15.2563584Z * [new branch] gh/v0i0/3/orig -> origin/gh/v0i0/3/orig 2025-08-14T21:20:15.2567062Z * [new branch] gh/v0i0/4/base -> origin/gh/v0i0/4/base 2025-08-14T21:20:15.2568594Z * [new branch] gh/v0i0/4/head -> origin/gh/v0i0/4/head 2025-08-14T21:20:15.2570487Z * [new branch] gh/v0i0/4/orig -> origin/gh/v0i0/4/orig 2025-08-14T21:20:15.2572980Z * [new branch] gh/v0i0/5/base -> origin/gh/v0i0/5/base 2025-08-14T21:20:15.2574767Z * [new branch] gh/v0i0/5/head -> origin/gh/v0i0/5/head 2025-08-14T21:20:15.2576530Z * [new branch] gh/v0i0/5/orig -> origin/gh/v0i0/5/orig 2025-08-14T21:20:15.2579658Z * [new branch] gh/v0i0/6/base -> origin/gh/v0i0/6/base 2025-08-14T21:20:15.2581469Z * [new branch] gh/v0i0/6/head -> origin/gh/v0i0/6/head 2025-08-14T21:20:15.2583307Z * [new branch] gh/v0i0/6/orig -> origin/gh/v0i0/6/orig 2025-08-14T21:20:15.2586468Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-08-14T21:20:15.2589018Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-08-14T21:20:15.2591624Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-08-14T21:20:15.2594635Z * [new branch] gh/wconstab/392/base -> origin/gh/wconstab/392/base 2025-08-14T21:20:15.2596525Z * [new branch] gh/wconstab/392/head -> origin/gh/wconstab/392/head 2025-08-14T21:20:15.2598312Z * [new branch] gh/wconstab/392/orig -> origin/gh/wconstab/392/orig 2025-08-14T21:20:15.2600913Z * [new branch] gh/wconstab/419/base -> origin/gh/wconstab/419/base 2025-08-14T21:20:15.2602708Z * [new branch] gh/wconstab/419/head -> origin/gh/wconstab/419/head 2025-08-14T21:20:15.2604589Z * [new branch] gh/wconstab/419/orig -> origin/gh/wconstab/419/orig 2025-08-14T21:20:15.2607857Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-08-14T21:20:15.2609691Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-08-14T21:20:15.2611568Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-08-14T21:20:15.2614057Z * [new branch] gh/wconstab/425/base -> origin/gh/wconstab/425/base 2025-08-14T21:20:15.2615954Z * [new branch] gh/wconstab/425/head -> origin/gh/wconstab/425/head 2025-08-14T21:20:15.2617856Z * [new branch] gh/wconstab/425/orig -> origin/gh/wconstab/425/orig 2025-08-14T21:20:15.2620434Z * [new branch] gh/wconstab/426/base -> origin/gh/wconstab/426/base 2025-08-14T21:20:15.2622114Z * [new branch] gh/wconstab/426/head -> origin/gh/wconstab/426/head 2025-08-14T21:20:15.2623948Z * [new branch] gh/wconstab/426/orig -> origin/gh/wconstab/426/orig 2025-08-14T21:20:15.2626403Z * [new branch] gh/wconstab/427/base -> origin/gh/wconstab/427/base 2025-08-14T21:20:15.2628327Z * [new branch] gh/wconstab/427/head -> origin/gh/wconstab/427/head 2025-08-14T21:20:15.2630237Z * [new branch] gh/wconstab/427/orig -> origin/gh/wconstab/427/orig 2025-08-14T21:20:15.2632798Z * [new branch] gh/wconstab/428/base -> origin/gh/wconstab/428/base 2025-08-14T21:20:15.2634703Z * [new branch] gh/wconstab/428/head -> origin/gh/wconstab/428/head 2025-08-14T21:20:15.2636952Z * [new branch] gh/wconstab/428/orig -> origin/gh/wconstab/428/orig 2025-08-14T21:20:15.2639676Z * [new branch] gh/wconstab/429/base -> origin/gh/wconstab/429/base 2025-08-14T21:20:15.2641644Z * [new branch] gh/wconstab/429/head -> origin/gh/wconstab/429/head 2025-08-14T21:20:15.2643853Z * [new branch] gh/wconstab/429/orig -> origin/gh/wconstab/429/orig 2025-08-14T21:20:15.2646312Z * [new branch] gh/wconstab/430/base -> origin/gh/wconstab/430/base 2025-08-14T21:20:15.2648375Z * [new branch] gh/wconstab/430/head -> origin/gh/wconstab/430/head 2025-08-14T21:20:15.2650271Z * [new branch] gh/wconstab/430/orig -> origin/gh/wconstab/430/orig 2025-08-14T21:20:15.2652942Z * [new branch] gh/wconstab/431/base -> origin/gh/wconstab/431/base 2025-08-14T21:20:15.2654866Z * [new branch] gh/wconstab/431/head -> origin/gh/wconstab/431/head 2025-08-14T21:20:15.2656830Z * [new branch] gh/wconstab/431/orig -> origin/gh/wconstab/431/orig 2025-08-14T21:20:15.2659511Z * [new branch] gh/wconstab/432/base -> origin/gh/wconstab/432/base 2025-08-14T21:20:15.2661338Z * [new branch] gh/wconstab/432/head -> origin/gh/wconstab/432/head 2025-08-14T21:20:15.2663149Z * [new branch] gh/wconstab/432/orig -> origin/gh/wconstab/432/orig 2025-08-14T21:20:15.2665742Z * [new branch] gh/wconstab/433/base -> origin/gh/wconstab/433/base 2025-08-14T21:20:15.2667825Z * [new branch] gh/wconstab/433/head -> origin/gh/wconstab/433/head 2025-08-14T21:20:15.2669456Z * [new branch] gh/wconstab/433/orig -> origin/gh/wconstab/433/orig 2025-08-14T21:20:15.2671796Z * [new branch] gh/wconstab/434/base -> origin/gh/wconstab/434/base 2025-08-14T21:20:15.2673727Z * [new branch] gh/wconstab/434/head -> origin/gh/wconstab/434/head 2025-08-14T21:20:15.2675571Z * [new branch] gh/wconstab/434/orig -> origin/gh/wconstab/434/orig 2025-08-14T21:20:15.2678123Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-08-14T21:20:15.2679910Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-08-14T21:20:15.2681744Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-08-14T21:20:15.2684326Z * [new branch] gh/wconstab/436/base -> origin/gh/wconstab/436/base 2025-08-14T21:20:15.2686494Z * [new branch] gh/wconstab/436/head -> origin/gh/wconstab/436/head 2025-08-14T21:20:15.2688286Z * [new branch] gh/wconstab/436/orig -> origin/gh/wconstab/436/orig 2025-08-14T21:20:15.2690804Z * [new branch] gh/wconstab/437/base -> origin/gh/wconstab/437/base 2025-08-14T21:20:15.2692721Z * [new branch] gh/wconstab/437/head -> origin/gh/wconstab/437/head 2025-08-14T21:20:15.2694570Z * [new branch] gh/wconstab/437/orig -> origin/gh/wconstab/437/orig 2025-08-14T21:20:15.2697454Z * [new branch] gh/wconstab/438/base -> origin/gh/wconstab/438/base 2025-08-14T21:20:15.2699238Z * [new branch] gh/wconstab/438/head -> origin/gh/wconstab/438/head 2025-08-14T21:20:15.2701008Z * [new branch] gh/wconstab/438/orig -> origin/gh/wconstab/438/orig 2025-08-14T21:20:15.2703458Z * [new branch] gh/wconstab/439/base -> origin/gh/wconstab/439/base 2025-08-14T21:20:15.2705439Z * [new branch] gh/wconstab/439/head -> origin/gh/wconstab/439/head 2025-08-14T21:20:15.2707257Z * [new branch] gh/wconstab/439/orig -> origin/gh/wconstab/439/orig 2025-08-14T21:20:15.2709851Z * [new branch] gh/wconstab/440/base -> origin/gh/wconstab/440/base 2025-08-14T21:20:15.2711958Z * [new branch] gh/wconstab/440/head -> origin/gh/wconstab/440/head 2025-08-14T21:20:15.2713835Z * [new branch] gh/wconstab/440/orig -> origin/gh/wconstab/440/orig 2025-08-14T21:20:15.2716611Z * [new branch] gh/wconstab/441/base -> origin/gh/wconstab/441/base 2025-08-14T21:20:15.2718957Z * [new branch] gh/wconstab/441/head -> origin/gh/wconstab/441/head 2025-08-14T21:20:15.2720779Z * [new branch] gh/wconstab/441/orig -> origin/gh/wconstab/441/orig 2025-08-14T21:20:15.2723255Z * [new branch] gh/wconstab/442/base -> origin/gh/wconstab/442/base 2025-08-14T21:20:15.2725198Z * [new branch] gh/wconstab/442/head -> origin/gh/wconstab/442/head 2025-08-14T21:20:15.2726972Z * [new branch] gh/wconstab/442/orig -> origin/gh/wconstab/442/orig 2025-08-14T21:20:15.2730093Z * [new branch] gh/weifengpy/27/base -> origin/gh/weifengpy/27/base 2025-08-14T21:20:15.2732064Z * [new branch] gh/weifengpy/27/head -> origin/gh/weifengpy/27/head 2025-08-14T21:20:15.2733701Z * [new branch] gh/weifengpy/27/orig -> origin/gh/weifengpy/27/orig 2025-08-14T21:20:15.2736784Z * [new branch] gh/weifengpy/30/base -> origin/gh/weifengpy/30/base 2025-08-14T21:20:15.2738561Z * [new branch] gh/weifengpy/30/head -> origin/gh/weifengpy/30/head 2025-08-14T21:20:15.2740407Z * [new branch] gh/weifengpy/30/orig -> origin/gh/weifengpy/30/orig 2025-08-14T21:20:15.2742951Z * [new branch] gh/weifengpy/31/base -> origin/gh/weifengpy/31/base 2025-08-14T21:20:15.2744649Z * [new branch] gh/weifengpy/31/head -> origin/gh/weifengpy/31/head 2025-08-14T21:20:15.2746764Z * [new branch] gh/weifengpy/31/orig -> origin/gh/weifengpy/31/orig 2025-08-14T21:20:15.2751012Z * [new branch] gh/weifengpy/32/base -> origin/gh/weifengpy/32/base 2025-08-14T21:20:15.2752851Z * [new branch] gh/weifengpy/32/head -> origin/gh/weifengpy/32/head 2025-08-14T21:20:15.2754722Z * [new branch] gh/weifengpy/32/orig -> origin/gh/weifengpy/32/orig 2025-08-14T21:20:15.2757054Z * [new branch] gh/weifengpy/33/base -> origin/gh/weifengpy/33/base 2025-08-14T21:20:15.2759276Z * [new branch] gh/weifengpy/33/head -> origin/gh/weifengpy/33/head 2025-08-14T21:20:15.2760716Z * [new branch] gh/weifengpy/33/orig -> origin/gh/weifengpy/33/orig 2025-08-14T21:20:15.2764053Z * [new branch] gh/williamwen42/196/base -> origin/gh/williamwen42/196/base 2025-08-14T21:20:15.2766208Z * [new branch] gh/williamwen42/196/head -> origin/gh/williamwen42/196/head 2025-08-14T21:20:15.2767998Z * [new branch] gh/williamwen42/196/orig -> origin/gh/williamwen42/196/orig 2025-08-14T21:20:15.2770557Z * [new branch] gh/williamwen42/209/base -> origin/gh/williamwen42/209/base 2025-08-14T21:20:15.2772323Z * [new branch] gh/williamwen42/209/head -> origin/gh/williamwen42/209/head 2025-08-14T21:20:15.2774403Z * [new branch] gh/williamwen42/209/orig -> origin/gh/williamwen42/209/orig 2025-08-14T21:20:15.2776819Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-08-14T21:20:15.2781658Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-08-14T21:20:15.2783327Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-08-14T21:20:15.2783562Z * [new branch] gh/williamwen42/252/base -> origin/gh/williamwen42/252/base 2025-08-14T21:20:15.2784369Z * [new branch] gh/williamwen42/252/head -> origin/gh/williamwen42/252/head 2025-08-14T21:20:15.2786426Z * [new branch] gh/williamwen42/252/orig -> origin/gh/williamwen42/252/orig 2025-08-14T21:20:15.2789458Z * [new branch] gh/williamwen42/256/base -> origin/gh/williamwen42/256/base 2025-08-14T21:20:15.2790918Z * [new branch] gh/williamwen42/256/head -> origin/gh/williamwen42/256/head 2025-08-14T21:20:15.2792693Z * [new branch] gh/williamwen42/256/orig -> origin/gh/williamwen42/256/orig 2025-08-14T21:20:15.2795252Z * [new branch] gh/williamwen42/258/base -> origin/gh/williamwen42/258/base 2025-08-14T21:20:15.2797092Z * [new branch] gh/williamwen42/258/head -> origin/gh/williamwen42/258/head 2025-08-14T21:20:15.2798870Z * [new branch] gh/williamwen42/258/orig -> origin/gh/williamwen42/258/orig 2025-08-14T21:20:15.2801422Z * [new branch] gh/williamwen42/260/base -> origin/gh/williamwen42/260/base 2025-08-14T21:20:15.2803304Z * [new branch] gh/williamwen42/260/head -> origin/gh/williamwen42/260/head 2025-08-14T21:20:15.2805323Z * [new branch] gh/williamwen42/260/orig -> origin/gh/williamwen42/260/orig 2025-08-14T21:20:15.2808337Z * [new branch] gh/williamwen42/261/base -> origin/gh/williamwen42/261/base 2025-08-14T21:20:15.2810131Z * [new branch] gh/williamwen42/261/head -> origin/gh/williamwen42/261/head 2025-08-14T21:20:15.2811923Z * [new branch] gh/williamwen42/261/orig -> origin/gh/williamwen42/261/orig 2025-08-14T21:20:15.2814790Z * [new branch] gh/williamwen42/262/base -> origin/gh/williamwen42/262/base 2025-08-14T21:20:15.2816682Z * [new branch] gh/williamwen42/262/head -> origin/gh/williamwen42/262/head 2025-08-14T21:20:15.2818366Z * [new branch] gh/williamwen42/262/orig -> origin/gh/williamwen42/262/orig 2025-08-14T21:20:15.2820853Z * [new branch] gh/williamwen42/263/base -> origin/gh/williamwen42/263/base 2025-08-14T21:20:15.2822647Z * [new branch] gh/williamwen42/263/head -> origin/gh/williamwen42/263/head 2025-08-14T21:20:15.2824402Z * [new branch] gh/williamwen42/263/orig -> origin/gh/williamwen42/263/orig 2025-08-14T21:20:15.2826948Z * [new branch] gh/williamwen42/264/base -> origin/gh/williamwen42/264/base 2025-08-14T21:20:15.2828842Z * [new branch] gh/williamwen42/264/head -> origin/gh/williamwen42/264/head 2025-08-14T21:20:15.2830677Z * [new branch] gh/williamwen42/264/orig -> origin/gh/williamwen42/264/orig 2025-08-14T21:20:15.2833607Z * [new branch] gh/williamwen42/265/base -> origin/gh/williamwen42/265/base 2025-08-14T21:20:15.2835274Z * [new branch] gh/williamwen42/265/head -> origin/gh/williamwen42/265/head 2025-08-14T21:20:15.2837005Z * [new branch] gh/williamwen42/265/orig -> origin/gh/williamwen42/265/orig 2025-08-14T21:20:15.2839930Z * [new branch] gh/williamwen42/266/base -> origin/gh/williamwen42/266/base 2025-08-14T21:20:15.2841901Z * [new branch] gh/williamwen42/266/head -> origin/gh/williamwen42/266/head 2025-08-14T21:20:15.2843562Z * [new branch] gh/williamwen42/266/orig -> origin/gh/williamwen42/266/orig 2025-08-14T21:20:15.2846215Z * [new branch] gh/williamwen42/267/base -> origin/gh/williamwen42/267/base 2025-08-14T21:20:15.2848015Z * [new branch] gh/williamwen42/267/head -> origin/gh/williamwen42/267/head 2025-08-14T21:20:15.2850103Z * [new branch] gh/williamwen42/267/orig -> origin/gh/williamwen42/267/orig 2025-08-14T21:20:15.2852655Z * [new branch] gh/williamwen42/268/base -> origin/gh/williamwen42/268/base 2025-08-14T21:20:15.2854462Z * [new branch] gh/williamwen42/268/head -> origin/gh/williamwen42/268/head 2025-08-14T21:20:15.2856252Z * [new branch] gh/williamwen42/268/orig -> origin/gh/williamwen42/268/orig 2025-08-14T21:20:15.2858565Z * [new branch] gh/williamwen42/269/base -> origin/gh/williamwen42/269/base 2025-08-14T21:20:15.2860464Z * [new branch] gh/williamwen42/269/head -> origin/gh/williamwen42/269/head 2025-08-14T21:20:15.2862235Z * [new branch] gh/williamwen42/269/orig -> origin/gh/williamwen42/269/orig 2025-08-14T21:20:15.2865586Z * [new branch] gh/williamwen42/270/base -> origin/gh/williamwen42/270/base 2025-08-14T21:20:15.2867378Z * [new branch] gh/williamwen42/270/head -> origin/gh/williamwen42/270/head 2025-08-14T21:20:15.2869189Z * [new branch] gh/williamwen42/270/orig -> origin/gh/williamwen42/270/orig 2025-08-14T21:20:15.2871878Z * [new branch] gh/williamwen42/271/base -> origin/gh/williamwen42/271/base 2025-08-14T21:20:15.2873648Z * [new branch] gh/williamwen42/271/head -> origin/gh/williamwen42/271/head 2025-08-14T21:20:15.2875568Z * [new branch] gh/williamwen42/271/orig -> origin/gh/williamwen42/271/orig 2025-08-14T21:20:15.2878166Z * [new branch] gh/williamwen42/272/base -> origin/gh/williamwen42/272/base 2025-08-14T21:20:15.2880035Z * [new branch] gh/williamwen42/272/head -> origin/gh/williamwen42/272/head 2025-08-14T21:20:15.2881857Z * [new branch] gh/williamwen42/272/orig -> origin/gh/williamwen42/272/orig 2025-08-14T21:20:15.2884605Z * [new branch] gh/williamwen42/273/base -> origin/gh/williamwen42/273/base 2025-08-14T21:20:15.2886611Z * [new branch] gh/williamwen42/273/head -> origin/gh/williamwen42/273/head 2025-08-14T21:20:15.2888559Z * [new branch] gh/williamwen42/273/orig -> origin/gh/williamwen42/273/orig 2025-08-14T21:20:15.2891145Z * [new branch] gh/williamwen42/274/base -> origin/gh/williamwen42/274/base 2025-08-14T21:20:15.2893374Z * [new branch] gh/williamwen42/274/head -> origin/gh/williamwen42/274/head 2025-08-14T21:20:15.2894732Z * [new branch] gh/williamwen42/274/orig -> origin/gh/williamwen42/274/orig 2025-08-14T21:20:15.2897186Z * [new branch] gh/williamwen42/275/base -> origin/gh/williamwen42/275/base 2025-08-14T21:20:15.2899108Z * [new branch] gh/williamwen42/275/head -> origin/gh/williamwen42/275/head 2025-08-14T21:20:15.2901519Z * [new branch] gh/williamwen42/276/base -> origin/gh/williamwen42/276/base 2025-08-14T21:20:15.2903250Z * [new branch] gh/williamwen42/276/head -> origin/gh/williamwen42/276/head 2025-08-14T21:20:15.2905079Z * [new branch] gh/williamwen42/276/orig -> origin/gh/williamwen42/276/orig 2025-08-14T21:20:15.2907818Z * [new branch] gh/williamwen42/277/base -> origin/gh/williamwen42/277/base 2025-08-14T21:20:15.2909623Z * [new branch] gh/williamwen42/277/head -> origin/gh/williamwen42/277/head 2025-08-14T21:20:15.2911388Z * [new branch] gh/williamwen42/277/orig -> origin/gh/williamwen42/277/orig 2025-08-14T21:20:15.2913919Z * [new branch] gh/williamwen42/278/base -> origin/gh/williamwen42/278/base 2025-08-14T21:20:15.2915754Z * [new branch] gh/williamwen42/278/head -> origin/gh/williamwen42/278/head 2025-08-14T21:20:15.2917589Z * [new branch] gh/williamwen42/278/orig -> origin/gh/williamwen42/278/orig 2025-08-14T21:20:15.2919996Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-08-14T21:20:15.2921773Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-08-14T21:20:15.2923608Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-08-14T21:20:15.2926931Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-08-14T21:20:15.2928734Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-08-14T21:20:15.2931134Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-08-14T21:20:15.2932842Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-08-14T21:20:15.2935848Z * [new branch] gh/xmfan/18/base -> origin/gh/xmfan/18/base 2025-08-14T21:20:15.2937714Z * [new branch] gh/xmfan/18/head -> origin/gh/xmfan/18/head 2025-08-14T21:20:15.2940240Z * [new branch] gh/xmfan/228/base -> origin/gh/xmfan/228/base 2025-08-14T21:20:15.2941972Z * [new branch] gh/xmfan/228/head -> origin/gh/xmfan/228/head 2025-08-14T21:20:15.2944395Z * [new branch] gh/xmfan/228/orig -> origin/gh/xmfan/228/orig 2025-08-14T21:20:15.2947319Z * [new branch] gh/xmfan/229/base -> origin/gh/xmfan/229/base 2025-08-14T21:20:15.2949043Z * [new branch] gh/xmfan/229/head -> origin/gh/xmfan/229/head 2025-08-14T21:20:15.2950781Z * [new branch] gh/xmfan/229/orig -> origin/gh/xmfan/229/orig 2025-08-14T21:20:15.2953267Z * [new branch] gh/xmfan/237/base -> origin/gh/xmfan/237/base 2025-08-14T21:20:15.2955079Z * [new branch] gh/xmfan/237/head -> origin/gh/xmfan/237/head 2025-08-14T21:20:15.2957324Z * [new branch] gh/xmfan/237/orig -> origin/gh/xmfan/237/orig 2025-08-14T21:20:15.2959307Z * [new branch] gh/xmfan/244/base -> origin/gh/xmfan/244/base 2025-08-14T21:20:15.2961158Z * [new branch] gh/xmfan/244/head -> origin/gh/xmfan/244/head 2025-08-14T21:20:15.2963196Z * [new branch] gh/xmfan/244/orig -> origin/gh/xmfan/244/orig 2025-08-14T21:20:15.2965793Z * [new branch] gh/xmfan/246/base -> origin/gh/xmfan/246/base 2025-08-14T21:20:15.2967568Z * [new branch] gh/xmfan/246/head -> origin/gh/xmfan/246/head 2025-08-14T21:20:15.2969334Z * [new branch] gh/xmfan/246/orig -> origin/gh/xmfan/246/orig 2025-08-14T21:20:15.2971862Z * [new branch] gh/xmfan/253/base -> origin/gh/xmfan/253/base 2025-08-14T21:20:15.2973712Z * [new branch] gh/xmfan/253/head -> origin/gh/xmfan/253/head 2025-08-14T21:20:15.2975506Z * [new branch] gh/xmfan/253/orig -> origin/gh/xmfan/253/orig 2025-08-14T21:20:15.2978410Z * [new branch] gh/xmfan/254/base -> origin/gh/xmfan/254/base 2025-08-14T21:20:15.2980202Z * [new branch] gh/xmfan/254/head -> origin/gh/xmfan/254/head 2025-08-14T21:20:15.2981999Z * [new branch] gh/xmfan/254/orig -> origin/gh/xmfan/254/orig 2025-08-14T21:20:15.2984462Z * [new branch] gh/xmfan/260/base -> origin/gh/xmfan/260/base 2025-08-14T21:20:15.2986764Z * [new branch] gh/xmfan/260/head -> origin/gh/xmfan/260/head 2025-08-14T21:20:15.2988922Z * [new branch] gh/xmfan/260/orig -> origin/gh/xmfan/260/orig 2025-08-14T21:20:15.2991234Z * [new branch] gh/xmfan/262/base -> origin/gh/xmfan/262/base 2025-08-14T21:20:15.2993032Z * [new branch] gh/xmfan/262/head -> origin/gh/xmfan/262/head 2025-08-14T21:20:15.2994800Z * [new branch] gh/xmfan/262/orig -> origin/gh/xmfan/262/orig 2025-08-14T21:20:15.2997363Z * [new branch] gh/xmfan/263/base -> origin/gh/xmfan/263/base 2025-08-14T21:20:15.2999108Z * [new branch] gh/xmfan/263/head -> origin/gh/xmfan/263/head 2025-08-14T21:20:15.3000912Z * [new branch] gh/xmfan/263/orig -> origin/gh/xmfan/263/orig 2025-08-14T21:20:15.3003836Z * [new branch] gh/xmfan/264/base -> origin/gh/xmfan/264/base 2025-08-14T21:20:15.3005799Z * [new branch] gh/xmfan/264/head -> origin/gh/xmfan/264/head 2025-08-14T21:20:15.3008240Z * [new branch] gh/xmfan/264/orig -> origin/gh/xmfan/264/orig 2025-08-14T21:20:15.3010744Z * [new branch] gh/xmfan/268/base -> origin/gh/xmfan/268/base 2025-08-14T21:20:15.3012496Z * [new branch] gh/xmfan/268/head -> origin/gh/xmfan/268/head 2025-08-14T21:20:15.3014395Z * [new branch] gh/xmfan/268/orig -> origin/gh/xmfan/268/orig 2025-08-14T21:20:15.3016889Z * [new branch] gh/xmfan/269/base -> origin/gh/xmfan/269/base 2025-08-14T21:20:15.3021931Z * [new branch] gh/xmfan/269/head -> origin/gh/xmfan/269/head 2025-08-14T21:20:15.3023830Z * [new branch] gh/xmfan/269/orig -> origin/gh/xmfan/269/orig 2025-08-14T21:20:15.3026234Z * [new branch] gh/xmfan/270/base -> origin/gh/xmfan/270/base 2025-08-14T21:20:15.3028093Z * [new branch] gh/xmfan/270/head -> origin/gh/xmfan/270/head 2025-08-14T21:20:15.3029971Z * [new branch] gh/xmfan/270/orig -> origin/gh/xmfan/270/orig 2025-08-14T21:20:15.3033007Z * [new branch] gh/xmfan/271/base -> origin/gh/xmfan/271/base 2025-08-14T21:20:15.3034793Z * [new branch] gh/xmfan/271/head -> origin/gh/xmfan/271/head 2025-08-14T21:20:15.3036586Z * [new branch] gh/xmfan/271/orig -> origin/gh/xmfan/271/orig 2025-08-14T21:20:15.3039086Z * [new branch] gh/xmfan/272/base -> origin/gh/xmfan/272/base 2025-08-14T21:20:15.3040856Z * [new branch] gh/xmfan/272/head -> origin/gh/xmfan/272/head 2025-08-14T21:20:15.3042906Z * [new branch] gh/xmfan/272/orig -> origin/gh/xmfan/272/orig 2025-08-14T21:20:15.3045526Z * [new branch] gh/xmfan/273/base -> origin/gh/xmfan/273/base 2025-08-14T21:20:15.3047372Z * [new branch] gh/xmfan/273/head -> origin/gh/xmfan/273/head 2025-08-14T21:20:15.3049439Z * [new branch] gh/xmfan/273/orig -> origin/gh/xmfan/273/orig 2025-08-14T21:20:15.3052010Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-08-14T21:20:15.3053836Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-08-14T21:20:15.3055637Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-08-14T21:20:15.3058180Z * [new branch] gh/xmfan/275/base -> origin/gh/xmfan/275/base 2025-08-14T21:20:15.3059969Z * [new branch] gh/xmfan/275/head -> origin/gh/xmfan/275/head 2025-08-14T21:20:15.3061787Z * [new branch] gh/xmfan/275/orig -> origin/gh/xmfan/275/orig 2025-08-14T21:20:15.3064361Z * [new branch] gh/xmfan/276/base -> origin/gh/xmfan/276/base 2025-08-14T21:20:15.3066218Z * [new branch] gh/xmfan/276/head -> origin/gh/xmfan/276/head 2025-08-14T21:20:15.3068153Z * [new branch] gh/xmfan/276/orig -> origin/gh/xmfan/276/orig 2025-08-14T21:20:15.3070706Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-08-14T21:20:15.3072541Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-08-14T21:20:15.3074340Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-08-14T21:20:15.3077388Z * [new branch] gh/xuanzhang816/12/base -> origin/gh/xuanzhang816/12/base 2025-08-14T21:20:15.3079134Z * [new branch] gh/xuanzhang816/12/head -> origin/gh/xuanzhang816/12/head 2025-08-14T21:20:15.3080922Z * [new branch] gh/xuanzhang816/12/orig -> origin/gh/xuanzhang816/12/orig 2025-08-14T21:20:15.3083393Z * [new branch] gh/xuanzhang816/14/base -> origin/gh/xuanzhang816/14/base 2025-08-14T21:20:15.3085422Z * [new branch] gh/xuanzhang816/14/head -> origin/gh/xuanzhang816/14/head 2025-08-14T21:20:15.3087201Z * [new branch] gh/xuanzhang816/14/orig -> origin/gh/xuanzhang816/14/orig 2025-08-14T21:20:15.3090116Z * [new branch] gh/xuanzhang816/18/base -> origin/gh/xuanzhang816/18/base 2025-08-14T21:20:15.3092114Z * [new branch] gh/xuanzhang816/18/head -> origin/gh/xuanzhang816/18/head 2025-08-14T21:20:15.3094017Z * [new branch] gh/xuanzhang816/18/orig -> origin/gh/xuanzhang816/18/orig 2025-08-14T21:20:15.3096663Z * [new branch] gh/xuanzhang816/19/base -> origin/gh/xuanzhang816/19/base 2025-08-14T21:20:15.3098427Z * [new branch] gh/xuanzhang816/19/head -> origin/gh/xuanzhang816/19/head 2025-08-14T21:20:15.3100171Z * [new branch] gh/xuanzhang816/19/orig -> origin/gh/xuanzhang816/19/orig 2025-08-14T21:20:15.3103216Z * [new branch] gh/xuanzhang816/20/base -> origin/gh/xuanzhang816/20/base 2025-08-14T21:20:15.3105023Z * [new branch] gh/xuanzhang816/20/head -> origin/gh/xuanzhang816/20/head 2025-08-14T21:20:15.3106830Z * [new branch] gh/xuanzhang816/20/orig -> origin/gh/xuanzhang816/20/orig 2025-08-14T21:20:15.3109358Z * [new branch] gh/xuanzhang816/21/base -> origin/gh/xuanzhang816/21/base 2025-08-14T21:20:15.3111295Z * [new branch] gh/xuanzhang816/21/head -> origin/gh/xuanzhang816/21/head 2025-08-14T21:20:15.3113053Z * [new branch] gh/xuanzhang816/21/orig -> origin/gh/xuanzhang816/21/orig 2025-08-14T21:20:15.3116117Z * [new branch] gh/xuanzhang816/22/base -> origin/gh/xuanzhang816/22/base 2025-08-14T21:20:15.3118128Z * [new branch] gh/xuanzhang816/22/head -> origin/gh/xuanzhang816/22/head 2025-08-14T21:20:15.3119779Z * [new branch] gh/xuanzhang816/22/orig -> origin/gh/xuanzhang816/22/orig 2025-08-14T21:20:15.3122223Z * [new branch] gh/xuanzhang816/23/base -> origin/gh/xuanzhang816/23/base 2025-08-14T21:20:15.3124012Z * [new branch] gh/xuanzhang816/23/head -> origin/gh/xuanzhang816/23/head 2025-08-14T21:20:15.3126072Z * [new branch] gh/xuanzhang816/23/orig -> origin/gh/xuanzhang816/23/orig 2025-08-14T21:20:15.3129040Z * [new branch] gh/xuanzhang816/24/base -> origin/gh/xuanzhang816/24/base 2025-08-14T21:20:15.3130820Z * [new branch] gh/xuanzhang816/24/head -> origin/gh/xuanzhang816/24/head 2025-08-14T21:20:15.3132608Z * [new branch] gh/xuanzhang816/24/orig -> origin/gh/xuanzhang816/24/orig 2025-08-14T21:20:15.3135709Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-08-14T21:20:15.3137561Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-08-14T21:20:15.3139464Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 2025-08-14T21:20:15.3142021Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-08-14T21:20:15.3143817Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-08-14T21:20:15.3145628Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-08-14T21:20:15.3149665Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-08-14T21:20:15.3151563Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-08-14T21:20:15.3153335Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-08-14T21:20:15.3155837Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-08-14T21:20:15.3157678Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-08-14T21:20:15.3159456Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-08-14T21:20:15.3161964Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-08-14T21:20:15.3163779Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-08-14T21:20:15.3165812Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-08-14T21:20:15.3168326Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-08-14T21:20:15.3170176Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-08-14T21:20:15.3171992Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-08-14T21:20:15.3174479Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-08-14T21:20:15.3176283Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-08-14T21:20:15.3178073Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-08-14T21:20:15.3180494Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-08-14T21:20:15.3182308Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-08-14T21:20:15.3184072Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-08-14T21:20:15.3186647Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-08-14T21:20:15.3188511Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-08-14T21:20:15.3190951Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-08-14T21:20:15.3192998Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-08-14T21:20:15.3194719Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-08-14T21:20:15.3197140Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-08-14T21:20:15.3198880Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-08-14T21:20:15.3200690Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-08-14T21:20:15.3203219Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-08-14T21:20:15.3205140Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-08-14T21:20:15.3206941Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-08-14T21:20:15.3209445Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-08-14T21:20:15.3211272Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-08-14T21:20:15.3213005Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-08-14T21:20:15.3215540Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-08-14T21:20:15.3217428Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 2025-08-14T21:20:15.3219300Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-08-14T21:20:15.3221829Z * [new branch] gh/yanbing-j/36/base -> origin/gh/yanbing-j/36/base 2025-08-14T21:20:15.3223643Z * [new branch] gh/yanbing-j/36/head -> origin/gh/yanbing-j/36/head 2025-08-14T21:20:15.3225615Z * [new branch] gh/yanbing-j/36/orig -> origin/gh/yanbing-j/36/orig 2025-08-14T21:20:15.3228159Z * [new branch] gh/yanbing-j/37/base -> origin/gh/yanbing-j/37/base 2025-08-14T21:20:15.3229932Z * [new branch] gh/yanbing-j/37/head -> origin/gh/yanbing-j/37/head 2025-08-14T21:20:15.3231738Z * [new branch] gh/yanbing-j/37/orig -> origin/gh/yanbing-j/37/orig 2025-08-14T21:20:15.3234189Z * [new branch] gh/yanbing-j/39/base -> origin/gh/yanbing-j/39/base 2025-08-14T21:20:15.3236064Z * [new branch] gh/yanbing-j/39/head -> origin/gh/yanbing-j/39/head 2025-08-14T21:20:15.3237809Z * [new branch] gh/yanbing-j/39/orig -> origin/gh/yanbing-j/39/orig 2025-08-14T21:20:15.3241430Z * [new branch] gh/yangw-dev/1/base -> origin/gh/yangw-dev/1/base 2025-08-14T21:20:15.3243913Z * [new branch] gh/yangw-dev/10/base -> origin/gh/yangw-dev/10/base 2025-08-14T21:20:15.3245978Z * [new branch] gh/yangw-dev/10/head -> origin/gh/yangw-dev/10/head 2025-08-14T21:20:15.3248548Z * [new branch] gh/yangw-dev/10/orig -> origin/gh/yangw-dev/10/orig 2025-08-14T21:20:15.3251068Z * [new branch] gh/yangw-dev/11/base -> origin/gh/yangw-dev/11/base 2025-08-14T21:20:15.3252950Z * [new branch] gh/yangw-dev/11/head -> origin/gh/yangw-dev/11/head 2025-08-14T21:20:15.3254744Z * [new branch] gh/yangw-dev/11/orig -> origin/gh/yangw-dev/11/orig 2025-08-14T21:20:15.3257201Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-08-14T21:20:15.3259019Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-08-14T21:20:15.3260801Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-08-14T21:20:15.3263324Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-08-14T21:20:15.3265455Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-08-14T21:20:15.3267577Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-08-14T21:20:15.3269944Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-08-14T21:20:15.3271711Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-08-14T21:20:15.3273491Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-08-14T21:20:15.3275996Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-08-14T21:20:15.3277765Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-08-14T21:20:15.3279504Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-08-14T21:20:15.3282038Z * [new branch] gh/yangw-dev/16/base -> origin/gh/yangw-dev/16/base 2025-08-14T21:20:15.3283871Z * [new branch] gh/yangw-dev/16/head -> origin/gh/yangw-dev/16/head 2025-08-14T21:20:15.3285902Z * [new branch] gh/yangw-dev/16/orig -> origin/gh/yangw-dev/16/orig 2025-08-14T21:20:15.3288407Z * [new branch] gh/yangw-dev/17/base -> origin/gh/yangw-dev/17/base 2025-08-14T21:20:15.3290269Z * [new branch] gh/yangw-dev/17/head -> origin/gh/yangw-dev/17/head 2025-08-14T21:20:15.3292106Z * [new branch] gh/yangw-dev/17/orig -> origin/gh/yangw-dev/17/orig 2025-08-14T21:20:15.3294600Z * [new branch] gh/yangw-dev/18/base -> origin/gh/yangw-dev/18/base 2025-08-14T21:20:15.3296423Z * [new branch] gh/yangw-dev/18/head -> origin/gh/yangw-dev/18/head 2025-08-14T21:20:15.3298183Z * [new branch] gh/yangw-dev/18/orig -> origin/gh/yangw-dev/18/orig 2025-08-14T21:20:15.3300646Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-08-14T21:20:15.3302440Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-08-14T21:20:15.3304232Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-08-14T21:20:15.3306763Z * [new branch] gh/yangw-dev/2/base -> origin/gh/yangw-dev/2/base 2025-08-14T21:20:15.3308539Z * [new branch] gh/yangw-dev/2/head -> origin/gh/yangw-dev/2/head 2025-08-14T21:20:15.3311526Z * [new branch] gh/yangw-dev/3/base -> origin/gh/yangw-dev/3/base 2025-08-14T21:20:15.3313358Z * [new branch] gh/yangw-dev/3/head -> origin/gh/yangw-dev/3/head 2025-08-14T21:20:15.3315877Z * [new branch] gh/yangw-dev/4/base -> origin/gh/yangw-dev/4/base 2025-08-14T21:20:15.3317616Z * [new branch] gh/yangw-dev/4/head -> origin/gh/yangw-dev/4/head 2025-08-14T21:20:15.3319984Z * [new branch] gh/yangw-dev/5/base -> origin/gh/yangw-dev/5/base 2025-08-14T21:20:15.3321746Z * [new branch] gh/yangw-dev/5/head -> origin/gh/yangw-dev/5/head 2025-08-14T21:20:15.3324188Z * [new branch] gh/yangw-dev/6/base -> origin/gh/yangw-dev/6/base 2025-08-14T21:20:15.3326172Z * [new branch] gh/yangw-dev/6/head -> origin/gh/yangw-dev/6/head 2025-08-14T21:20:15.3328669Z * [new branch] gh/yangw-dev/7/base -> origin/gh/yangw-dev/7/base 2025-08-14T21:20:15.3330428Z * [new branch] gh/yangw-dev/7/head -> origin/gh/yangw-dev/7/head 2025-08-14T21:20:15.3332852Z * [new branch] gh/yangw-dev/8/base -> origin/gh/yangw-dev/8/base 2025-08-14T21:20:15.3334639Z * [new branch] gh/yangw-dev/8/head -> origin/gh/yangw-dev/8/head 2025-08-14T21:20:15.3336411Z * [new branch] gh/yangw-dev/8/orig -> origin/gh/yangw-dev/8/orig 2025-08-14T21:20:15.3338855Z * [new branch] gh/yangw-dev/9/base -> origin/gh/yangw-dev/9/base 2025-08-14T21:20:15.3340823Z * [new branch] gh/yangw-dev/9/head -> origin/gh/yangw-dev/9/head 2025-08-14T21:20:15.3342409Z * [new branch] gh/yangw-dev/9/orig -> origin/gh/yangw-dev/9/orig 2025-08-14T21:20:15.3345545Z * [new branch] gh/ydwu4/233/base -> origin/gh/ydwu4/233/base 2025-08-14T21:20:15.3347432Z * [new branch] gh/ydwu4/233/head -> origin/gh/ydwu4/233/head 2025-08-14T21:20:15.3349490Z * [new branch] gh/ydwu4/233/orig -> origin/gh/ydwu4/233/orig 2025-08-14T21:20:15.3352134Z * [new branch] gh/ydwu4/246/base -> origin/gh/ydwu4/246/base 2025-08-14T21:20:15.3354017Z * [new branch] gh/ydwu4/246/head -> origin/gh/ydwu4/246/head 2025-08-14T21:20:15.3355851Z * [new branch] gh/ydwu4/246/orig -> origin/gh/ydwu4/246/orig 2025-08-14T21:20:15.3358437Z * [new branch] gh/ydwu4/253/base -> origin/gh/ydwu4/253/base 2025-08-14T21:20:15.3360245Z * [new branch] gh/ydwu4/253/head -> origin/gh/ydwu4/253/head 2025-08-14T21:20:15.3362135Z * [new branch] gh/ydwu4/253/orig -> origin/gh/ydwu4/253/orig 2025-08-14T21:20:15.3364847Z * [new branch] gh/ydwu4/255/base -> origin/gh/ydwu4/255/base 2025-08-14T21:20:15.3366708Z * [new branch] gh/ydwu4/255/head -> origin/gh/ydwu4/255/head 2025-08-14T21:20:15.3368489Z * [new branch] gh/ydwu4/255/orig -> origin/gh/ydwu4/255/orig 2025-08-14T21:20:15.3371131Z * [new branch] gh/ydwu4/259/base -> origin/gh/ydwu4/259/base 2025-08-14T21:20:15.3372912Z * [new branch] gh/ydwu4/259/head -> origin/gh/ydwu4/259/head 2025-08-14T21:20:15.3374695Z * [new branch] gh/ydwu4/259/orig -> origin/gh/ydwu4/259/orig 2025-08-14T21:20:15.3377272Z * [new branch] gh/ydwu4/262/base -> origin/gh/ydwu4/262/base 2025-08-14T21:20:15.3379087Z * [new branch] gh/ydwu4/262/head -> origin/gh/ydwu4/262/head 2025-08-14T21:20:15.3380898Z * [new branch] gh/ydwu4/262/orig -> origin/gh/ydwu4/262/orig 2025-08-14T21:20:15.3383385Z * [new branch] gh/ydwu4/263/base -> origin/gh/ydwu4/263/base 2025-08-14T21:20:15.3385274Z * [new branch] gh/ydwu4/263/head -> origin/gh/ydwu4/263/head 2025-08-14T21:20:15.3387063Z * [new branch] gh/ydwu4/263/orig -> origin/gh/ydwu4/263/orig 2025-08-14T21:20:15.3389826Z * [new branch] gh/ydwu4/269/base -> origin/gh/ydwu4/269/base 2025-08-14T21:20:15.3391557Z * [new branch] gh/ydwu4/269/head -> origin/gh/ydwu4/269/head 2025-08-14T21:20:15.3393372Z * [new branch] gh/ydwu4/269/orig -> origin/gh/ydwu4/269/orig 2025-08-14T21:20:15.3395933Z * [new branch] gh/ydwu4/270/base -> origin/gh/ydwu4/270/base 2025-08-14T21:20:15.3397775Z * [new branch] gh/ydwu4/270/head -> origin/gh/ydwu4/270/head 2025-08-14T21:20:15.3399581Z * [new branch] gh/ydwu4/270/orig -> origin/gh/ydwu4/270/orig 2025-08-14T21:20:15.3402101Z * [new branch] gh/ydwu4/272/base -> origin/gh/ydwu4/272/base 2025-08-14T21:20:15.3404056Z * [new branch] gh/ydwu4/272/head -> origin/gh/ydwu4/272/head 2025-08-14T21:20:15.3406082Z * [new branch] gh/ydwu4/272/orig -> origin/gh/ydwu4/272/orig 2025-08-14T21:20:15.3408512Z * [new branch] gh/ydwu4/275/base -> origin/gh/ydwu4/275/base 2025-08-14T21:20:15.3410357Z * [new branch] gh/ydwu4/275/head -> origin/gh/ydwu4/275/head 2025-08-14T21:20:15.3412226Z * [new branch] gh/ydwu4/275/orig -> origin/gh/ydwu4/275/orig 2025-08-14T21:20:15.3414671Z * [new branch] gh/ydwu4/276/base -> origin/gh/ydwu4/276/base 2025-08-14T21:20:15.3416705Z * [new branch] gh/ydwu4/276/head -> origin/gh/ydwu4/276/head 2025-08-14T21:20:15.3418360Z * [new branch] gh/ydwu4/276/orig -> origin/gh/ydwu4/276/orig 2025-08-14T21:20:15.3420815Z * [new branch] gh/ydwu4/277/base -> origin/gh/ydwu4/277/base 2025-08-14T21:20:15.3422690Z * [new branch] gh/ydwu4/277/head -> origin/gh/ydwu4/277/head 2025-08-14T21:20:15.3424575Z * [new branch] gh/ydwu4/277/orig -> origin/gh/ydwu4/277/orig 2025-08-14T21:20:15.3427024Z * [new branch] gh/ydwu4/278/base -> origin/gh/ydwu4/278/base 2025-08-14T21:20:15.3428813Z * [new branch] gh/ydwu4/278/head -> origin/gh/ydwu4/278/head 2025-08-14T21:20:15.3430584Z * [new branch] gh/ydwu4/278/orig -> origin/gh/ydwu4/278/orig 2025-08-14T21:20:15.3433297Z * [new branch] gh/ydwu4/279/base -> origin/gh/ydwu4/279/base 2025-08-14T21:20:15.3435344Z * [new branch] gh/ydwu4/279/head -> origin/gh/ydwu4/279/head 2025-08-14T21:20:15.3437145Z * [new branch] gh/ydwu4/279/orig -> origin/gh/ydwu4/279/orig 2025-08-14T21:20:15.3440500Z * [new branch] gh/ydwu4/280/base -> origin/gh/ydwu4/280/base 2025-08-14T21:20:15.3442294Z * [new branch] gh/ydwu4/280/head -> origin/gh/ydwu4/280/head 2025-08-14T21:20:15.3444169Z * [new branch] gh/ydwu4/280/orig -> origin/gh/ydwu4/280/orig 2025-08-14T21:20:15.3447690Z * [new branch] gh/ydwu4/281/base -> origin/gh/ydwu4/281/base 2025-08-14T21:20:15.3449753Z * [new branch] gh/ydwu4/281/head -> origin/gh/ydwu4/281/head 2025-08-14T21:20:15.3451596Z * [new branch] gh/ydwu4/281/orig -> origin/gh/ydwu4/281/orig 2025-08-14T21:20:15.3454130Z * [new branch] gh/ydwu4/282/base -> origin/gh/ydwu4/282/base 2025-08-14T21:20:15.3455971Z * [new branch] gh/ydwu4/282/head -> origin/gh/ydwu4/282/head 2025-08-14T21:20:15.3457796Z * [new branch] gh/ydwu4/282/orig -> origin/gh/ydwu4/282/orig 2025-08-14T21:20:15.3460241Z * [new branch] gh/ydwu4/283/base -> origin/gh/ydwu4/283/base 2025-08-14T21:20:15.3462122Z * [new branch] gh/ydwu4/283/head -> origin/gh/ydwu4/283/head 2025-08-14T21:20:15.3463908Z * [new branch] gh/ydwu4/283/orig -> origin/gh/ydwu4/283/orig 2025-08-14T21:20:15.3466461Z * [new branch] gh/ydwu4/284/base -> origin/gh/ydwu4/284/base 2025-08-14T21:20:15.3468304Z * [new branch] gh/ydwu4/284/head -> origin/gh/ydwu4/284/head 2025-08-14T21:20:15.3470111Z * [new branch] gh/ydwu4/284/orig -> origin/gh/ydwu4/284/orig 2025-08-14T21:20:15.3472600Z * [new branch] gh/ydwu4/285/base -> origin/gh/ydwu4/285/base 2025-08-14T21:20:15.3474435Z * [new branch] gh/ydwu4/285/head -> origin/gh/ydwu4/285/head 2025-08-14T21:20:15.3476261Z * [new branch] gh/ydwu4/285/orig -> origin/gh/ydwu4/285/orig 2025-08-14T21:20:15.3478868Z * [new branch] gh/ydwu4/286/base -> origin/gh/ydwu4/286/base 2025-08-14T21:20:15.3480545Z * [new branch] gh/ydwu4/286/head -> origin/gh/ydwu4/286/head 2025-08-14T21:20:15.3482397Z * [new branch] gh/ydwu4/286/orig -> origin/gh/ydwu4/286/orig 2025-08-14T21:20:15.3485183Z * [new branch] gh/ydwu4/287/base -> origin/gh/ydwu4/287/base 2025-08-14T21:20:15.3487635Z * [new branch] gh/ydwu4/287/head -> origin/gh/ydwu4/287/head 2025-08-14T21:20:15.3489391Z * [new branch] gh/ydwu4/287/orig -> origin/gh/ydwu4/287/orig 2025-08-14T21:20:15.3492065Z * [new branch] gh/ydwu4/288/base -> origin/gh/ydwu4/288/base 2025-08-14T21:20:15.3494087Z * [new branch] gh/ydwu4/288/head -> origin/gh/ydwu4/288/head 2025-08-14T21:20:15.3495726Z * [new branch] gh/ydwu4/288/orig -> origin/gh/ydwu4/288/orig 2025-08-14T21:20:15.3498293Z * [new branch] gh/ydwu4/289/base -> origin/gh/ydwu4/289/base 2025-08-14T21:20:15.3500226Z * [new branch] gh/ydwu4/289/head -> origin/gh/ydwu4/289/head 2025-08-14T21:20:15.3501966Z * [new branch] gh/ydwu4/289/orig -> origin/gh/ydwu4/289/orig 2025-08-14T21:20:15.3504612Z * [new branch] gh/ydwu4/290/base -> origin/gh/ydwu4/290/base 2025-08-14T21:20:15.3506464Z * [new branch] gh/ydwu4/290/head -> origin/gh/ydwu4/290/head 2025-08-14T21:20:15.3508332Z * [new branch] gh/ydwu4/290/orig -> origin/gh/ydwu4/290/orig 2025-08-14T21:20:15.3510975Z * [new branch] gh/ydwu4/291/base -> origin/gh/ydwu4/291/base 2025-08-14T21:20:15.3512871Z * [new branch] gh/ydwu4/291/head -> origin/gh/ydwu4/291/head 2025-08-14T21:20:15.3514661Z * [new branch] gh/ydwu4/291/orig -> origin/gh/ydwu4/291/orig 2025-08-14T21:20:15.3517401Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-08-14T21:20:15.3519084Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-08-14T21:20:15.3520913Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-08-14T21:20:15.3523393Z * [new branch] gh/ydwu4/293/base -> origin/gh/ydwu4/293/base 2025-08-14T21:20:15.3525553Z * [new branch] gh/ydwu4/293/head -> origin/gh/ydwu4/293/head 2025-08-14T21:20:15.3527221Z * [new branch] gh/ydwu4/293/orig -> origin/gh/ydwu4/293/orig 2025-08-14T21:20:15.3529846Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-08-14T21:20:15.3531734Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-08-14T21:20:15.3533931Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 2025-08-14T21:20:15.3536607Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-08-14T21:20:15.3538450Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-08-14T21:20:15.3540269Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-08-14T21:20:15.3542801Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-08-14T21:20:15.3544540Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-08-14T21:20:15.3546355Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-08-14T21:20:15.3549304Z * [new branch] gh/ydwu4/297/base -> origin/gh/ydwu4/297/base 2025-08-14T21:20:15.3551176Z * [new branch] gh/ydwu4/297/head -> origin/gh/ydwu4/297/head 2025-08-14T21:20:15.3552983Z * [new branch] gh/ydwu4/297/orig -> origin/gh/ydwu4/297/orig 2025-08-14T21:20:15.3555412Z * [new branch] gh/ydwu4/298/base -> origin/gh/ydwu4/298/base 2025-08-14T21:20:15.3557301Z * [new branch] gh/ydwu4/298/head -> origin/gh/ydwu4/298/head 2025-08-14T21:20:15.3559045Z * [new branch] gh/ydwu4/298/orig -> origin/gh/ydwu4/298/orig 2025-08-14T21:20:15.3561502Z * [new branch] gh/ydwu4/299/base -> origin/gh/ydwu4/299/base 2025-08-14T21:20:15.3563314Z * [new branch] gh/ydwu4/299/head -> origin/gh/ydwu4/299/head 2025-08-14T21:20:15.3565312Z * [new branch] gh/ydwu4/299/orig -> origin/gh/ydwu4/299/orig 2025-08-14T21:20:15.3568871Z * [new branch] gh/ydwu4/300/base -> origin/gh/ydwu4/300/base 2025-08-14T21:20:15.3571468Z * [new branch] gh/ydwu4/300/head -> origin/gh/ydwu4/300/head 2025-08-14T21:20:15.3573296Z * [new branch] gh/ydwu4/300/orig -> origin/gh/ydwu4/300/orig 2025-08-14T21:20:15.3575979Z * [new branch] gh/ydwu4/301/base -> origin/gh/ydwu4/301/base 2025-08-14T21:20:15.3577806Z * [new branch] gh/ydwu4/301/head -> origin/gh/ydwu4/301/head 2025-08-14T21:20:15.3579604Z * [new branch] gh/ydwu4/301/orig -> origin/gh/ydwu4/301/orig 2025-08-14T21:20:15.3582165Z * [new branch] gh/ydwu4/302/base -> origin/gh/ydwu4/302/base 2025-08-14T21:20:15.3583966Z * [new branch] gh/ydwu4/302/head -> origin/gh/ydwu4/302/head 2025-08-14T21:20:15.3585746Z * [new branch] gh/ydwu4/302/orig -> origin/gh/ydwu4/302/orig 2025-08-14T21:20:15.3588300Z * [new branch] gh/ydwu4/303/base -> origin/gh/ydwu4/303/base 2025-08-14T21:20:15.3590321Z * [new branch] gh/ydwu4/303/head -> origin/gh/ydwu4/303/head 2025-08-14T21:20:15.3592135Z * [new branch] gh/ydwu4/303/orig -> origin/gh/ydwu4/303/orig 2025-08-14T21:20:15.3594655Z * [new branch] gh/ydwu4/304/base -> origin/gh/ydwu4/304/base 2025-08-14T21:20:15.3597077Z * [new branch] gh/ydwu4/304/head -> origin/gh/ydwu4/304/head 2025-08-14T21:20:15.3599296Z * [new branch] gh/ydwu4/304/orig -> origin/gh/ydwu4/304/orig 2025-08-14T21:20:15.3602019Z * [new branch] gh/ydwu4/305/base -> origin/gh/ydwu4/305/base 2025-08-14T21:20:15.3603926Z * [new branch] gh/ydwu4/305/head -> origin/gh/ydwu4/305/head 2025-08-14T21:20:15.3605874Z * [new branch] gh/ydwu4/305/orig -> origin/gh/ydwu4/305/orig 2025-08-14T21:20:15.3608341Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-08-14T21:20:15.3610640Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-08-14T21:20:15.3612514Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-08-14T21:20:15.3615047Z * [new branch] gh/ydwu4/307/base -> origin/gh/ydwu4/307/base 2025-08-14T21:20:15.3616813Z * [new branch] gh/ydwu4/307/head -> origin/gh/ydwu4/307/head 2025-08-14T21:20:15.3618662Z * [new branch] gh/ydwu4/307/orig -> origin/gh/ydwu4/307/orig 2025-08-14T21:20:15.3621344Z * [new branch] gh/ydwu4/308/base -> origin/gh/ydwu4/308/base 2025-08-14T21:20:15.3623180Z * [new branch] gh/ydwu4/308/head -> origin/gh/ydwu4/308/head 2025-08-14T21:20:15.3624933Z * [new branch] gh/ydwu4/308/orig -> origin/gh/ydwu4/308/orig 2025-08-14T21:20:15.3627438Z * [new branch] gh/ydwu4/309/base -> origin/gh/ydwu4/309/base 2025-08-14T21:20:15.3629283Z * [new branch] gh/ydwu4/309/head -> origin/gh/ydwu4/309/head 2025-08-14T21:20:15.3631088Z * [new branch] gh/ydwu4/309/orig -> origin/gh/ydwu4/309/orig 2025-08-14T21:20:15.3633557Z * [new branch] gh/ydwu4/310/base -> origin/gh/ydwu4/310/base 2025-08-14T21:20:15.3635341Z * [new branch] gh/ydwu4/310/head -> origin/gh/ydwu4/310/head 2025-08-14T21:20:15.3637143Z * [new branch] gh/ydwu4/310/orig -> origin/gh/ydwu4/310/orig 2025-08-14T21:20:15.3639488Z * [new branch] gh/ydwu4/311/base -> origin/gh/ydwu4/311/base 2025-08-14T21:20:15.3641363Z * [new branch] gh/ydwu4/311/head -> origin/gh/ydwu4/311/head 2025-08-14T21:20:15.3643685Z * [new branch] gh/ydwu4/311/orig -> origin/gh/ydwu4/311/orig 2025-08-14T21:20:15.3647253Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-08-14T21:20:15.3649700Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-08-14T21:20:15.3652019Z * [new branch] gh/yf225/171/base -> origin/gh/yf225/171/base 2025-08-14T21:20:15.3653921Z * [new branch] gh/yf225/171/head -> origin/gh/yf225/171/head 2025-08-14T21:20:15.3655779Z * [new branch] gh/yf225/171/orig -> origin/gh/yf225/171/orig 2025-08-14T21:20:15.3658361Z * [new branch] gh/yf225/172/base -> origin/gh/yf225/172/base 2025-08-14T21:20:15.3660078Z * [new branch] gh/yf225/172/head -> origin/gh/yf225/172/head 2025-08-14T21:20:15.3661809Z * [new branch] gh/yf225/172/orig -> origin/gh/yf225/172/orig 2025-08-14T21:20:15.3664310Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-08-14T21:20:15.3666118Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-08-14T21:20:15.3669925Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-08-14T21:20:15.3671947Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-08-14T21:20:15.3673857Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-08-14T21:20:15.3676316Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-08-14T21:20:15.3678217Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-08-14T21:20:15.3679952Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-08-14T21:20:15.3683090Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-08-14T21:20:15.3685106Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-08-14T21:20:15.3687423Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-08-14T21:20:15.3689116Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-08-14T21:20:15.3692302Z * [new branch] gh/ysiraichi/79/base -> origin/gh/ysiraichi/79/base 2025-08-14T21:20:15.3694066Z * [new branch] gh/ysiraichi/79/head -> origin/gh/ysiraichi/79/head 2025-08-14T21:20:15.3696106Z * [new branch] gh/ysiraichi/79/orig -> origin/gh/ysiraichi/79/orig 2025-08-14T21:20:15.3698560Z * [new branch] gh/ysiraichi/81/base -> origin/gh/ysiraichi/81/base 2025-08-14T21:20:15.3700348Z * [new branch] gh/ysiraichi/81/head -> origin/gh/ysiraichi/81/head 2025-08-14T21:20:15.3702121Z * [new branch] gh/ysiraichi/81/orig -> origin/gh/ysiraichi/81/orig 2025-08-14T21:20:15.3705347Z * [new branch] gh/ysiraichi/84/base -> origin/gh/ysiraichi/84/base 2025-08-14T21:20:15.3707250Z * [new branch] gh/ysiraichi/84/head -> origin/gh/ysiraichi/84/head 2025-08-14T21:20:15.3709078Z * [new branch] gh/ysiraichi/84/orig -> origin/gh/ysiraichi/84/orig 2025-08-14T21:20:15.3711645Z * [new branch] gh/ysiraichi/85/base -> origin/gh/ysiraichi/85/base 2025-08-14T21:20:15.3713555Z * [new branch] gh/ysiraichi/85/head -> origin/gh/ysiraichi/85/head 2025-08-14T21:20:15.3715824Z * [new branch] gh/ysiraichi/85/orig -> origin/gh/ysiraichi/85/orig 2025-08-14T21:20:15.3718603Z * [new branch] gh/ysiraichi/86/base -> origin/gh/ysiraichi/86/base 2025-08-14T21:20:15.3720523Z * [new branch] gh/ysiraichi/86/head -> origin/gh/ysiraichi/86/head 2025-08-14T21:20:15.3722379Z * [new branch] gh/ysiraichi/86/orig -> origin/gh/ysiraichi/86/orig 2025-08-14T21:20:15.3725382Z * [new branch] gh/ysiraichi/87/base -> origin/gh/ysiraichi/87/base 2025-08-14T21:20:15.3727071Z * [new branch] gh/ysiraichi/87/head -> origin/gh/ysiraichi/87/head 2025-08-14T21:20:15.3728921Z * [new branch] gh/ysiraichi/87/orig -> origin/gh/ysiraichi/87/orig 2025-08-14T21:20:15.3731348Z * [new branch] gh/ysiraichi/88/base -> origin/gh/ysiraichi/88/base 2025-08-14T21:20:15.3733083Z * [new branch] gh/ysiraichi/88/head -> origin/gh/ysiraichi/88/head 2025-08-14T21:20:15.3734910Z * [new branch] gh/ysiraichi/88/orig -> origin/gh/ysiraichi/88/orig 2025-08-14T21:20:15.3737963Z * [new branch] gh/yuguo68/1/base -> origin/gh/yuguo68/1/base 2025-08-14T21:20:15.3739910Z * [new branch] gh/yuguo68/1/head -> origin/gh/yuguo68/1/head 2025-08-14T21:20:15.3741725Z * [new branch] gh/yuguo68/1/orig -> origin/gh/yuguo68/1/orig 2025-08-14T21:20:15.3744315Z * [new branch] gh/yuguo68/2/base -> origin/gh/yuguo68/2/base 2025-08-14T21:20:15.3746154Z * [new branch] gh/yuguo68/2/head -> origin/gh/yuguo68/2/head 2025-08-14T21:20:15.3747996Z * [new branch] gh/yuguo68/2/orig -> origin/gh/yuguo68/2/orig 2025-08-14T21:20:15.3754212Z * [new branch] gh/zhxchen17/25/base -> origin/gh/zhxchen17/25/base 2025-08-14T21:20:15.3756081Z * [new branch] gh/zhxchen17/25/head -> origin/gh/zhxchen17/25/head 2025-08-14T21:20:15.3757836Z * [new branch] gh/zhxchen17/25/orig -> origin/gh/zhxchen17/25/orig 2025-08-14T21:20:15.3760563Z * [new branch] gh/zhxchen17/31/base -> origin/gh/zhxchen17/31/base 2025-08-14T21:20:15.3762458Z * [new branch] gh/zhxchen17/31/head -> origin/gh/zhxchen17/31/head 2025-08-14T21:20:15.3765305Z * [new branch] gh/zhxchen17/31/orig -> origin/gh/zhxchen17/31/orig 2025-08-14T21:20:15.3768001Z * [new branch] gh/zhxchen17/33/base -> origin/gh/zhxchen17/33/base 2025-08-14T21:20:15.3769980Z * [new branch] gh/zhxchen17/33/head -> origin/gh/zhxchen17/33/head 2025-08-14T21:20:15.3771812Z * [new branch] gh/zhxchen17/33/orig -> origin/gh/zhxchen17/33/orig 2025-08-14T21:20:15.3774268Z * [new branch] gh/zhxchen17/34/base -> origin/gh/zhxchen17/34/base 2025-08-14T21:20:15.3776213Z * [new branch] gh/zhxchen17/34/head -> origin/gh/zhxchen17/34/head 2025-08-14T21:20:15.3778480Z * [new branch] gh/zhxchen17/35/base -> origin/gh/zhxchen17/35/base 2025-08-14T21:20:15.3780321Z * [new branch] gh/zhxchen17/35/head -> origin/gh/zhxchen17/35/head 2025-08-14T21:20:15.3782633Z * [new branch] gh/zhxchen17/36/base -> origin/gh/zhxchen17/36/base 2025-08-14T21:20:15.3784452Z * [new branch] gh/zhxchen17/36/head -> origin/gh/zhxchen17/36/head 2025-08-14T21:20:15.3786376Z * [new branch] gh/zhxchen17/36/orig -> origin/gh/zhxchen17/36/orig 2025-08-14T21:20:15.3789391Z * [new branch] gh/zklaus/1/base -> origin/gh/zklaus/1/base 2025-08-14T21:20:15.3791228Z * [new branch] gh/zklaus/1/head -> origin/gh/zklaus/1/head 2025-08-14T21:20:15.3793156Z * [new branch] gh/zklaus/1/orig -> origin/gh/zklaus/1/orig 2025-08-14T21:20:15.3795758Z * [new branch] gh/zklaus/10/base -> origin/gh/zklaus/10/base 2025-08-14T21:20:15.3797534Z * [new branch] gh/zklaus/10/head -> origin/gh/zklaus/10/head 2025-08-14T21:20:15.3799300Z * [new branch] gh/zklaus/10/orig -> origin/gh/zklaus/10/orig 2025-08-14T21:20:15.3801704Z * [new branch] gh/zklaus/11/base -> origin/gh/zklaus/11/base 2025-08-14T21:20:15.3803486Z * [new branch] gh/zklaus/11/head -> origin/gh/zklaus/11/head 2025-08-14T21:20:15.3805787Z * [new branch] gh/zklaus/11/orig -> origin/gh/zklaus/11/orig 2025-08-14T21:20:15.3807913Z * [new branch] gh/zklaus/12/base -> origin/gh/zklaus/12/base 2025-08-14T21:20:15.3809811Z * [new branch] gh/zklaus/12/head -> origin/gh/zklaus/12/head 2025-08-14T21:20:15.3811599Z * [new branch] gh/zklaus/12/orig -> origin/gh/zklaus/12/orig 2025-08-14T21:20:15.3814166Z * [new branch] gh/zklaus/14/base -> origin/gh/zklaus/14/base 2025-08-14T21:20:15.3816020Z * [new branch] gh/zklaus/14/head -> origin/gh/zklaus/14/head 2025-08-14T21:20:15.3817853Z * [new branch] gh/zklaus/14/orig -> origin/gh/zklaus/14/orig 2025-08-14T21:20:15.3820387Z * [new branch] gh/zklaus/15/base -> origin/gh/zklaus/15/base 2025-08-14T21:20:15.3822304Z * [new branch] gh/zklaus/15/head -> origin/gh/zklaus/15/head 2025-08-14T21:20:15.3824148Z * [new branch] gh/zklaus/15/orig -> origin/gh/zklaus/15/orig 2025-08-14T21:20:15.3826702Z * [new branch] gh/zklaus/16/base -> origin/gh/zklaus/16/base 2025-08-14T21:20:15.3828546Z * [new branch] gh/zklaus/16/head -> origin/gh/zklaus/16/head 2025-08-14T21:20:15.3830360Z * [new branch] gh/zklaus/16/orig -> origin/gh/zklaus/16/orig 2025-08-14T21:20:15.3832801Z * [new branch] gh/zklaus/17/base -> origin/gh/zklaus/17/base 2025-08-14T21:20:15.3834637Z * [new branch] gh/zklaus/17/head -> origin/gh/zklaus/17/head 2025-08-14T21:20:15.3836436Z * [new branch] gh/zklaus/17/orig -> origin/gh/zklaus/17/orig 2025-08-14T21:20:15.3838879Z * [new branch] gh/zklaus/18/base -> origin/gh/zklaus/18/base 2025-08-14T21:20:15.3840668Z * [new branch] gh/zklaus/18/head -> origin/gh/zklaus/18/head 2025-08-14T21:20:15.3842549Z * [new branch] gh/zklaus/18/orig -> origin/gh/zklaus/18/orig 2025-08-14T21:20:15.3845649Z * [new branch] gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-08-14T21:20:15.3847458Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-08-14T21:20:15.3849437Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-08-14T21:20:15.3851877Z * [new branch] gh/zklaus/7/base -> origin/gh/zklaus/7/base 2025-08-14T21:20:15.3853696Z * [new branch] gh/zklaus/7/head -> origin/gh/zklaus/7/head 2025-08-14T21:20:15.3855445Z * [new branch] gh/zklaus/7/orig -> origin/gh/zklaus/7/orig 2025-08-14T21:20:15.3857861Z * [new branch] gh/zklaus/9/base -> origin/gh/zklaus/9/base 2025-08-14T21:20:15.3860244Z * [new branch] gh/zklaus/9/head -> origin/gh/zklaus/9/head 2025-08-14T21:20:15.3862049Z * [new branch] gh/zklaus/9/orig -> origin/gh/zklaus/9/orig 2025-08-14T21:20:15.3865129Z * [new branch] gh/zou3519/1175/base -> origin/gh/zou3519/1175/base 2025-08-14T21:20:15.3867109Z * [new branch] gh/zou3519/1175/head -> origin/gh/zou3519/1175/head 2025-08-14T21:20:15.3868907Z * [new branch] gh/zou3519/1175/orig -> origin/gh/zou3519/1175/orig 2025-08-14T21:20:15.3871426Z * [new branch] gh/zou3519/1177/base -> origin/gh/zou3519/1177/base 2025-08-14T21:20:15.3873260Z * [new branch] gh/zou3519/1177/head -> origin/gh/zou3519/1177/head 2025-08-14T21:20:15.3875074Z * [new branch] gh/zou3519/1177/orig -> origin/gh/zou3519/1177/orig 2025-08-14T21:20:15.3877629Z * [new branch] gh/zou3519/1187/base -> origin/gh/zou3519/1187/base 2025-08-14T21:20:15.3879405Z * [new branch] gh/zou3519/1187/head -> origin/gh/zou3519/1187/head 2025-08-14T21:20:15.3881386Z * [new branch] gh/zou3519/1187/orig -> origin/gh/zou3519/1187/orig 2025-08-14T21:20:15.3883778Z * [new branch] gh/zou3519/1188/base -> origin/gh/zou3519/1188/base 2025-08-14T21:20:15.3885866Z * [new branch] gh/zou3519/1188/head -> origin/gh/zou3519/1188/head 2025-08-14T21:20:15.3887772Z * [new branch] gh/zou3519/1188/orig -> origin/gh/zou3519/1188/orig 2025-08-14T21:20:15.3890126Z * [new branch] gh/zou3519/1189/base -> origin/gh/zou3519/1189/base 2025-08-14T21:20:15.3892027Z * [new branch] gh/zou3519/1189/head -> origin/gh/zou3519/1189/head 2025-08-14T21:20:15.3893833Z * [new branch] gh/zou3519/1189/orig -> origin/gh/zou3519/1189/orig 2025-08-14T21:20:15.3896391Z * [new branch] gh/zou3519/1190/base -> origin/gh/zou3519/1190/base 2025-08-14T21:20:15.3898168Z * [new branch] gh/zou3519/1190/head -> origin/gh/zou3519/1190/head 2025-08-14T21:20:15.3900021Z * [new branch] gh/zou3519/1190/orig -> origin/gh/zou3519/1190/orig 2025-08-14T21:20:15.3902656Z * [new branch] gh/zou3519/1191/base -> origin/gh/zou3519/1191/base 2025-08-14T21:20:15.3904640Z * [new branch] gh/zou3519/1191/head -> origin/gh/zou3519/1191/head 2025-08-14T21:20:15.3906519Z * [new branch] gh/zou3519/1191/orig -> origin/gh/zou3519/1191/orig 2025-08-14T21:20:15.3909604Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-08-14T21:20:15.3911374Z * [new branch] gh/zpcore/1/head -> origin/gh/zpcore/1/head 2025-08-14T21:20:15.3913976Z * [new branch] gh/zpcore/10/base -> origin/gh/zpcore/10/base 2025-08-14T21:20:15.3915752Z * [new branch] gh/zpcore/10/head -> origin/gh/zpcore/10/head 2025-08-14T21:20:15.3917511Z * [new branch] gh/zpcore/10/orig -> origin/gh/zpcore/10/orig 2025-08-14T21:20:15.3920087Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-08-14T21:20:15.3921972Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-08-14T21:20:15.3924163Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-08-14T21:20:15.3926708Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-08-14T21:20:15.3928650Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-08-14T21:20:15.3930467Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-08-14T21:20:15.3932913Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-08-14T21:20:15.3935381Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-08-14T21:20:15.3937700Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-08-14T21:20:15.3939472Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-08-14T21:20:15.3941854Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-08-14T21:20:15.3943620Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-08-14T21:20:15.3945990Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-08-14T21:20:15.3948045Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-08-14T21:20:15.3950349Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-08-14T21:20:15.3952101Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-08-14T21:20:15.3954436Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-08-14T21:20:15.3956254Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-08-14T21:20:15.3958739Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-08-14T21:20:15.3960374Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 2025-08-14T21:20:15.3963423Z * [new branch] gh/zpcore/9/head -> origin/gh/zpcore/9/head 2025-08-14T21:20:15.3965553Z * [new branch] gh/zpcore/9/orig -> origin/gh/zpcore/9/orig 2025-08-14T21:20:15.3967660Z * [new branch] google-main -> origin/google-main 2025-08-14T21:20:15.3970168Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-08-14T21:20:15.3971857Z * [new branch] guangyey/host_alloc -> origin/guangyey/host_alloc 2025-08-14T21:20:15.3973539Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-08-14T21:20:15.3976274Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-08-14T21:20:15.3978636Z * [new branch] haozhe/bf16-dynamic-shape -> origin/haozhe/bf16-dynamic-shape 2025-08-14T21:20:15.3980525Z * [new branch] hc_baseline -> origin/hc_baseline 2025-08-14T21:20:15.3982524Z * [new branch] headeronlyScalarType -> origin/headeronlyScalarType 2025-08-14T21:20:15.3984484Z * [new branch] hf_update -> origin/hf_update 2025-08-14T21:20:15.3986483Z * [new branch] hhh_decomp_mul -> origin/hhh_decomp_mul 2025-08-14T21:20:15.3988360Z * [new branch] hhh_rand -> origin/hhh_rand 2025-08-14T21:20:15.3990999Z * [new branch] hoy/mmsplitk -> origin/hoy/mmsplitk 2025-08-14T21:20:15.3992720Z * [new branch] hoy/triton-PR3973 -> origin/hoy/triton-PR3973 2025-08-14T21:20:15.3994619Z * [new branch] hoy/triton-coalescing-baseline -> origin/hoy/triton-coalescing-baseline 2025-08-14T21:20:15.3996313Z * [new branch] hoy/triton-coalescing-min -> origin/hoy/triton-coalescing-min 2025-08-14T21:20:15.3998026Z * [new branch] hoy/triton-coalescing-new -> origin/hoy/triton-coalescing-new 2025-08-14T21:20:15.4000091Z * [new branch] hoy/triton-coalescing-vec -> origin/hoy/triton-coalescing-vec 2025-08-14T21:20:15.4001982Z * [new branch] inductordecompfix -> origin/inductordecompfix 2025-08-14T21:20:15.4003954Z * [new branch] inline -> origin/inline 2025-08-14T21:20:15.4006615Z * [new branch] inlining -> origin/inlining 2025-08-14T21:20:15.4008657Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-08-14T21:20:15.4010590Z * [new branch] int8_sdpa -> origin/int8_sdpa 2025-08-14T21:20:15.4012669Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-08-14T21:20:15.4014861Z * [new branch] issue#58739 -> origin/issue#58739 2025-08-14T21:20:15.4016751Z * [new branch] issue-154849 -> origin/issue-154849 2025-08-14T21:20:15.4019500Z * [new branch] ivanov/cherry-pick-ckpt-fixes -> origin/ivanov/cherry-pick-ckpt-fixes 2025-08-14T21:20:15.4022469Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-08-14T21:20:15.4023866Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-08-14T21:20:15.4025736Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-08-14T21:20:15.4028072Z * [new branch] justinchu/attention-tests -> origin/justinchu/attention-tests 2025-08-14T21:20:15.4029733Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-08-14T21:20:15.4032530Z * [new branch] justinchuby/JitScalarType -> origin/justinchuby/JitScalarType 2025-08-14T21:20:15.4034192Z * [new branch] justinchuby/dynamo-true -> origin/justinchuby/dynamo-true 2025-08-14T21:20:15.4035925Z * [new branch] justinchuby/opset-20 -> origin/justinchuby/opset-20 2025-08-14T21:20:15.4038393Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-08-14T21:20:15.4045841Z * [new branch] kainan_test -> origin/kainan_test 2025-08-14T21:20:15.4046133Z * [new branch] leslie/enable_poc_reduction_fusion -> origin/leslie/enable_poc_reduction_fusion 2025-08-14T21:20:15.4046407Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-08-14T21:20:15.4047559Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-08-14T21:20:15.4050803Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-08-14T21:20:15.4052237Z * [new branch] liaoxuan/tags_issue -> origin/liaoxuan/tags_issue 2025-08-14T21:20:15.4054326Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-08-14T21:20:15.4055705Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-08-14T21:20:15.4057543Z * [new branch] lintbuilddocker -> origin/lintbuilddocker 2025-08-14T21:20:15.4059332Z * [new branch] llama4-stable -> origin/llama4-stable 2025-08-14T21:20:15.4061281Z * [new branch] logdetfix -> origin/logdetfix 2025-08-14T21:20:15.4064636Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-08-14T21:20:15.4067290Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-08-14T21:20:15.4069085Z * [new branch] lucaskabela/fix_157452 -> origin/lucaskabela/fix_157452 2025-08-14T21:20:15.4070871Z * [new branch] lucaskabela/fix_circular_import_158120 -> origin/lucaskabela/fix_circular_import_158120 2025-08-14T21:20:15.4072538Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-08-14T21:20:15.4074229Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-08-14T21:20:15.4076436Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-08-14T21:20:15.4078397Z * [new branch] lucaskabela/issue_120648 -> origin/lucaskabela/issue_120648 2025-08-14T21:20:15.4080691Z * [new branch] lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-08-14T21:20:15.4082434Z * [new branch] lucaskabela/registry_fix -> origin/lucaskabela/registry_fix 2025-08-14T21:20:15.4084537Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-08-14T21:20:15.4086378Z * [new branch] lucaskabela/type_guards -> origin/lucaskabela/type_guards 2025-08-14T21:20:15.4088205Z * [new branch] lucaskabela/typing-misc -> origin/lucaskabela/typing-misc 2025-08-14T21:20:15.4090007Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-08-14T21:20:15.4091970Z * [new branch] lucaskabela/typing_bytecode_analysis_transform -> origin/lucaskabela/typing_bytecode_analysis_transform 2025-08-14T21:20:15.4093678Z * [new branch] lucaskabela/typing_cache_files -> origin/lucaskabela/typing_cache_files 2025-08-14T21:20:15.4095581Z * [new branch] lucaskabela/typing_compile_autograd -> origin/lucaskabela/typing_compile_autograd 2025-08-14T21:20:15.4097607Z * [new branch] lucaskabela/typing_debug_utils.py -> origin/lucaskabela/typing_debug_utils.py 2025-08-14T21:20:15.4099172Z * [new branch] lucaskabela/typing_decorators -> origin/lucaskabela/typing_decorators 2025-08-14T21:20:15.4101001Z * [new branch] lucaskabela/typing_eval_frame -> origin/lucaskabela/typing_eval_frame 2025-08-14T21:20:15.4102818Z * [new branch] lucaskabela/typing_for_codegen -> origin/lucaskabela/typing_for_codegen 2025-08-14T21:20:15.4104616Z * [new branch] lucaskabela/typing_output_graph -> origin/lucaskabela/typing_output_graph 2025-08-14T21:20:15.4106461Z * [new branch] lucaskabela/typing_side_effects -> origin/lucaskabela/typing_side_effects 2025-08-14T21:20:15.4108540Z * [new branch] lucaskabela/typing_source_guard -> origin/lucaskabela/typing_source_guard 2025-08-14T21:20:15.4110276Z * [new branch] lucaskabela/typing_trace_rules -> origin/lucaskabela/typing_trace_rules 2025-08-14T21:20:15.4112085Z * [new branch] lucaskabela/typing_utils.py -> origin/lucaskabela/typing_utils.py 2025-08-14T21:20:15.4113961Z * [new branch] lucaskabela/typing_utils_improvements -> origin/lucaskabela/typing_utils_improvements 2025-08-14T21:20:15.4115744Z * [new branch] main -> origin/main 2025-08-14T21:20:15.4118079Z * [new branch] main-enable-b200-distributed-tests -> origin/main-enable-b200-distributed-tests 2025-08-14T21:20:15.4120058Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-08-14T21:20:15.4121973Z * [new branch] malfet-patch-10 -> origin/malfet-patch-10 2025-08-14T21:20:15.4124070Z * [new branch] malfet-patch-11 -> origin/malfet-patch-11 2025-08-14T21:20:15.4126150Z * [new branch] malfet-patch-13 -> origin/malfet-patch-13 2025-08-14T21:20:15.4128272Z * [new branch] malfet-patch-14 -> origin/malfet-patch-14 2025-08-14T21:20:15.4130154Z * [new branch] malfet-patch-2 -> origin/malfet-patch-2 2025-08-14T21:20:15.4132326Z * [new branch] malfet-patch-3 -> origin/malfet-patch-3 2025-08-14T21:20:15.4134310Z * [new branch] malfet-patch-4 -> origin/malfet-patch-4 2025-08-14T21:20:15.4136396Z * [new branch] malfet-patch-5 -> origin/malfet-patch-5 2025-08-14T21:20:15.4138367Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-08-14T21:20:15.4140374Z * [new branch] malfet-patch-7 -> origin/malfet-patch-7 2025-08-14T21:20:15.4142388Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-08-14T21:20:15.4144332Z * [new branch] malfet-patch-9 -> origin/malfet-patch-9 2025-08-14T21:20:15.4147001Z * [new branch] malfet/delete-upsteam-cuda -> origin/malfet/delete-upsteam-cuda 2025-08-14T21:20:15.4151253Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-08-14T21:20:15.4153749Z * [new branch] manuel/fix_multidim_boolean_indexing -> origin/manuel/fix_multidim_boolean_indexing 2025-08-14T21:20:15.4155404Z * [new branch] manuel/np_empty_ellipsis -> origin/manuel/np_empty_ellipsis 2025-08-14T21:20:15.4157222Z * [new branch] manuel/test-ops-common-allow-mps -> origin/manuel/test-ops-common-allow-mps 2025-08-14T21:20:15.4159159Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-08-14T21:20:15.4161742Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-08-14T21:20:15.4163441Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-08-14T21:20:15.4165333Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-08-14T21:20:15.4167328Z * [new branch] mlazos/backup-test-branch -> origin/mlazos/backup-test-branch 2025-08-14T21:20:15.4168933Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-08-14T21:20:15.4170588Z * [new branch] mlazos/baseline -> origin/mlazos/baseline 2025-08-14T21:20:15.4172355Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-08-14T21:20:15.4174018Z * [new branch] mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-08-14T21:20:15.4176169Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-08-14T21:20:15.4178320Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-08-14T21:20:15.4180189Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-08-14T21:20:15.4182447Z * [new branch] mlazos/ck2 -> origin/mlazos/ck2 2025-08-14T21:20:15.4184581Z * [new branch] mlazos/combokernels -> origin/mlazos/combokernels 2025-08-14T21:20:15.4186394Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-08-14T21:20:15.4188461Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-08-14T21:20:15.4190357Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-08-14T21:20:15.4192210Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-08-14T21:20:15.4194024Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-08-14T21:20:15.4195838Z * [new branch] mlazos/data-gather -> origin/mlazos/data-gather 2025-08-14T21:20:15.4197661Z * [new branch] mlazos/data-ptrs2 -> origin/mlazos/data-ptrs2 2025-08-14T21:20:15.4199452Z * [new branch] mlazos/data-ptrs3 -> origin/mlazos/data-ptrs3 2025-08-14T21:20:15.4201367Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-08-14T21:20:15.4203211Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-08-14T21:20:15.4204976Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-08-14T21:20:15.4206805Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-08-14T21:20:15.4208843Z * [new branch] mlazos/disable-closures -> origin/mlazos/disable-closures 2025-08-14T21:20:15.4210659Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-08-14T21:20:15.4212316Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-08-14T21:20:15.4214709Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-08-14T21:20:15.4216584Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-08-14T21:20:15.4218815Z * [new branch] mlazos/exp_disable -> origin/mlazos/exp_disable 2025-08-14T21:20:15.4220463Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-08-14T21:20:15.4222234Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-08-14T21:20:15.4224003Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-08-14T21:20:15.4225881Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-08-14T21:20:15.4227754Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-08-14T21:20:15.4229612Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-08-14T21:20:15.4231917Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-08-14T21:20:15.4233800Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-08-14T21:20:15.4235958Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-08-14T21:20:15.4237773Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-08-14T21:20:15.4239721Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-08-14T21:20:15.4241588Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-08-14T21:20:15.4243420Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-08-14T21:20:15.4245415Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-08-14T21:20:15.4247396Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-08-14T21:20:15.4249593Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-08-14T21:20:15.4251365Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-08-14T21:20:15.4253222Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-08-14T21:20:15.4255046Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-08-14T21:20:15.4256906Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 2025-08-14T21:20:15.4258702Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-08-14T21:20:15.4260562Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-08-14T21:20:15.4262394Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-08-14T21:20:15.4264262Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-08-14T21:20:15.4266083Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-08-14T21:20:15.4267900Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-08-14T21:20:15.4269832Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-08-14T21:20:15.4271575Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-08-14T21:20:15.4273380Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-08-14T21:20:15.4275307Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-08-14T21:20:15.4277103Z * [new branch] mlazos/hop-modes -> origin/mlazos/hop-modes 2025-08-14T21:20:15.4279021Z * [new branch] mlazos/init-per-param -> origin/mlazos/init-per-param 2025-08-14T21:20:15.4280855Z * [new branch] mlazos/init_per_param -> origin/mlazos/init_per_param 2025-08-14T21:20:15.4282945Z * [new branch] mlazos/less-guards -> origin/mlazos/less-guards 2025-08-14T21:20:15.4284991Z * [new branch] mlazos/lr-composibility -> origin/mlazos/lr-composibility 2025-08-14T21:20:15.4286794Z * [new branch] mlazos/main -> origin/mlazos/main 2025-08-14T21:20:15.4289008Z * [new branch] mlazos/main-test-enablement -> origin/mlazos/main-test-enablement 2025-08-14T21:20:15.4290492Z * [new branch] mlazos/main2 -> origin/mlazos/main2 2025-08-14T21:20:15.4292293Z * [new branch] mlazos/mcg -> origin/mlazos/mcg 2025-08-14T21:20:15.4294142Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-08-14T21:20:15.4296103Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-08-14T21:20:15.4298556Z * [new branch] mlazos/mlazos/ck2 -> origin/mlazos/mlazos/ck2 2025-08-14T21:20:15.4300507Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-08-14T21:20:15.4302390Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-08-14T21:20:15.4304177Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-08-14T21:20:15.4306203Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-08-14T21:20:15.4308019Z * [new branch] mlazos/more-tests -> origin/mlazos/more-tests 2025-08-14T21:20:15.4309924Z * [new branch] mlazos/nested-dc -> origin/mlazos/nested-dc 2025-08-14T21:20:15.4311763Z * [new branch] mlazos/no-cpp -> origin/mlazos/no-cpp 2025-08-14T21:20:15.4313718Z * [new branch] mlazos/no-init-group-handling -> origin/mlazos/no-init-group-handling 2025-08-14T21:20:15.4315416Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-08-14T21:20:15.4317255Z * [new branch] mlazos/opt-bench-exp2 -> origin/mlazos/opt-bench-exp2 2025-08-14T21:20:15.4319025Z * [new branch] mlazos/opt-incr -> origin/mlazos/opt-incr 2025-08-14T21:20:15.4320892Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-08-14T21:20:15.4322707Z * [new branch] mlazos/proxy-opt -> origin/mlazos/proxy-opt 2025-08-14T21:20:15.4324689Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-08-14T21:20:15.4326569Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-08-14T21:20:15.4328408Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-08-14T21:20:15.4330284Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-08-14T21:20:15.4332165Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-08-14T21:20:15.4334033Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-08-14T21:20:15.4335910Z * [new branch] mlazos/sub-param-fix -> origin/mlazos/sub-param-fix 2025-08-14T21:20:15.4337873Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-08-14T21:20:15.4339698Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-08-14T21:20:15.4342068Z * [new branch] mlazos/test -> origin/mlazos/test 2025-08-14T21:20:15.4343974Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-08-14T21:20:15.4345823Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-08-14T21:20:15.4347762Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-08-14T21:20:15.4349871Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-08-14T21:20:15.4351678Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-08-14T21:20:15.4353607Z * [new branch] mlazos/topo-fix -> origin/mlazos/topo-fix 2025-08-14T21:20:15.4355531Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 2025-08-14T21:20:15.4357341Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-08-14T21:20:15.4359571Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-08-14T21:20:15.4361486Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-08-14T21:20:15.4363354Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-08-14T21:20:15.4365440Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-08-14T21:20:15.4367330Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-08-14T21:20:15.4369390Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-08-14T21:20:15.4371221Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-08-14T21:20:15.4373319Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-08-14T21:20:15.4375062Z * [new branch] modify-setupvllm -> origin/modify-setupvllm 2025-08-14T21:20:15.4377030Z * [new branch] move-theme-out-docker -> origin/move-theme-out-docker 2025-08-14T21:20:15.4378969Z * [new branch] mps-linear-1d -> origin/mps-linear-1d 2025-08-14T21:20:15.4381603Z * [new branch] msaroufim/be1 -> origin/msaroufim/be1 2025-08-14T21:20:15.4383388Z * [new branch] msaroufim/cn_path -> origin/msaroufim/cn_path 2025-08-14T21:20:15.4385302Z * [new branch] msaroufim/dtensorfusedadam -> origin/msaroufim/dtensorfusedadam 2025-08-14T21:20:15.4387058Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-08-14T21:20:15.4389560Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-08-14T21:20:15.4391499Z * [new branch] muon_dev -> origin/muon_dev 2025-08-14T21:20:15.4393442Z * [new branch] new-modifiy-setupvllm -> origin/new-modifiy-setupvllm 2025-08-14T21:20:15.4395394Z * [new branch] new-setupvllm -> origin/new-setupvllm 2025-08-14T21:20:15.4397383Z * [new branch] newtest-base -> origin/newtest-base 2025-08-14T21:20:15.4399895Z * [new branch] ngimel/cat_perf -> origin/ngimel/cat_perf 2025-08-14T21:20:15.4401660Z * [new branch] ngimel/cudamoduleload -> origin/ngimel/cudamoduleload 2025-08-14T21:20:15.4403367Z * [new branch] ngimel/fabric_driver_version -> origin/ngimel/fabric_driver_version 2025-08-14T21:20:15.4405195Z * [new branch] ngimel/fabric_symm -> origin/ngimel/fabric_symm 2025-08-14T21:20:15.4406878Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-08-14T21:20:15.4408613Z * [new branch] ngimel/grouped_mm_checks -> origin/ngimel/grouped_mm_checks 2025-08-14T21:20:15.4410317Z * [new branch] ngimel/guardfabric -> origin/ngimel/guardfabric 2025-08-14T21:20:15.4412018Z * [new branch] ngimel/index_None -> origin/ngimel/index_None 2025-08-14T21:20:15.4414115Z * [new branch] ngimel/modeguard -> origin/ngimel/modeguard 2025-08-14T21:20:15.4416360Z * [new branch] ngimel/multicast_fix -> origin/ngimel/multicast_fix 2025-08-14T21:20:15.4418295Z * [new branch] ngimel/unbind_multimem -> origin/ngimel/unbind_multimem 2025-08-14T21:20:15.4420170Z * [new branch] nightly -> origin/nightly 2025-08-14T21:20:15.4422230Z * [new branch] nmacchioni-patch-10 -> origin/nmacchioni-patch-10 2025-08-14T21:20:15.4424197Z * [new branch] nmacchioni-patch-7 -> origin/nmacchioni-patch-7 2025-08-14T21:20:15.4426278Z * [new branch] nmacchioni-patch-8 -> origin/nmacchioni-patch-8 2025-08-14T21:20:15.4428276Z * [new branch] nmacchioni-patch-9 -> origin/nmacchioni-patch-9 2025-08-14T21:20:15.4430259Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-08-14T21:20:15.4433485Z * [new branch] nweidia/enable-B200-inductor-nightly-ci -> origin/nweidia/enable-B200-inductor-nightly-ci 2025-08-14T21:20:15.4435225Z * [new branch] one-off -> origin/one-off 2025-08-14T21:20:15.4438392Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-08-14T21:20:15.4440282Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-08-14T21:20:15.4442133Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-08-14T21:20:15.4444178Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-08-14T21:20:15.4446346Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-08-14T21:20:15.4448651Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-08-14T21:20:15.4450486Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-08-14T21:20:15.4452379Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-08-14T21:20:15.4454845Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-08-14T21:20:15.4456747Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-08-14T21:20:15.4458658Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-08-14T21:20:15.4460407Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-08-14T21:20:15.4462217Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-08-14T21:20:15.4464037Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-08-14T21:20:15.4466748Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-08-14T21:20:15.4469009Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-08-14T21:20:15.4470396Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-08-14T21:20:15.4473599Z * [new branch] oulgen/fx_graph -> origin/oulgen/fx_graph 2025-08-14T21:20:15.4475502Z * [new branch] padded-tensor -> origin/padded-tensor 2025-08-14T21:20:15.4477856Z * [new branch] parallel_cat -> origin/parallel_cat 2025-08-14T21:20:15.4479779Z * [new branch] pca2 -> origin/pca2 2025-08-14T21:20:15.4481858Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-08-14T21:20:15.4484704Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-08-14T21:20:15.4486468Z * [new branch] pianpwk/dde_repeat_cat -> origin/pianpwk/dde_repeat_cat 2025-08-14T21:20:15.4488273Z * [new branch] pianpwk/draft_export_normalize -> origin/pianpwk/draft_export_normalize 2025-08-14T21:20:15.4489948Z * [new branch] pianpwk/dynamic_source_dim -> origin/pianpwk/dynamic_source_dim 2025-08-14T21:20:15.4491602Z * [new branch] pianpwk/invalidate_fake_memo -> origin/pianpwk/invalidate_fake_memo 2025-08-14T21:20:15.4493701Z * [new branch] pianpwk/lru_cache_bound_sympy -> origin/pianpwk/lru_cache_bound_sympy 2025-08-14T21:20:15.4495893Z * [new branch] pianpwk/max_1_strides -> origin/pianpwk/max_1_strides 2025-08-14T21:20:15.4497702Z * [new branch] pianpwk/nonzero_memo -> origin/pianpwk/nonzero_memo 2025-08-14T21:20:15.4499668Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-08-14T21:20:15.4501428Z * [new branch] pianpwk/oblivious_should_swap -> origin/pianpwk/oblivious_should_swap 2025-08-14T21:20:15.4503227Z * [new branch] pianpwk/oblivious_slice_forward -> origin/pianpwk/oblivious_slice_forward 2025-08-14T21:20:15.4505116Z * [new branch] pianpwk/oblivious_where -> origin/pianpwk/oblivious_where 2025-08-14T21:20:15.4506974Z * [new branch] pianpwk/param_static_pgo -> origin/pianpwk/param_static_pgo 2025-08-14T21:20:15.4508829Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-08-14T21:20:15.4510789Z * [new branch] pianpwk/remove_guard_fail_break -> origin/pianpwk/remove_guard_fail_break 2025-08-14T21:20:15.4512565Z * [new branch] pianpwk/slice_fresh_symbols -> origin/pianpwk/slice_fresh_symbols 2025-08-14T21:20:15.4514526Z * [new branch] pianpwk/sym_sym -> origin/pianpwk/sym_sym 2025-08-14T21:20:15.4516215Z * [new branch] pianpwk/test_slice_fake_impl -> origin/pianpwk/test_slice_fake_impl 2025-08-14T21:20:15.4518044Z * [new branch] pianpwk/unbacked_channels_last -> origin/pianpwk/unbacked_channels_last 2025-08-14T21:20:15.4519896Z * [new branch] pianpwk/unbacked_safe_conv1d -> origin/pianpwk/unbacked_safe_conv1d 2025-08-14T21:20:15.4521776Z * [new branch] pianpwk/unbacked_sdpa_flash -> origin/pianpwk/unbacked_sdpa_flash 2025-08-14T21:20:15.4523752Z * [new branch] pianpwk/unbacked_should_swap -> origin/pianpwk/unbacked_should_swap 2025-08-14T21:20:15.4525748Z * [new branch] pianpwk/unbacked_should_swap_2 -> origin/pianpwk/unbacked_should_swap_2 2025-08-14T21:20:15.4527566Z * [new branch] pianpwk/unbacked_slice_binding -> origin/pianpwk/unbacked_slice_binding 2025-08-14T21:20:15.4529362Z * [new branch] pianpwk/unbacked_slice_forward -> origin/pianpwk/unbacked_slice_forward 2025-08-14T21:20:15.4531146Z * [new branch] pianpwk/verbose_tensor_guards -> origin/pianpwk/verbose_tensor_guards 2025-08-14T21:20:15.4532910Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-08-14T21:20:15.4535014Z * [new branch] pianpwk/whitelist_optimizer -> origin/pianpwk/whitelist_optimizer 2025-08-14T21:20:15.4536960Z * [new branch] pin-torchao -> origin/pin-torchao 2025-08-14T21:20:15.4540036Z * [new branch] piz/fall_back_missing_0705 -> origin/piz/fall_back_missing_0705 2025-08-14T21:20:15.4541782Z * [new branch] piz/fall_back_missing_0716 -> origin/piz/fall_back_missing_0716 2025-08-14T21:20:15.4543582Z * [new branch] piz/fill_dist_cost_0702-3 -> origin/piz/fill_dist_cost_0702-3 2025-08-14T21:20:15.4545310Z * [new branch] piz/fill_dist_cost_0702-4 -> origin/piz/fill_dist_cost_0702-4 2025-08-14T21:20:15.4547263Z * [new branch] piz/fill_dist_cost_0702-5 -> origin/piz/fill_dist_cost_0702-5 2025-08-14T21:20:15.4552247Z * [new branch] piz/fix_sort_ -> origin/piz/fix_sort_ 2025-08-14T21:20:15.4554390Z * [new branch] piz/improve_scatter_0808 -> origin/piz/improve_scatter_0808 2025-08-14T21:20:15.4556287Z * [new branch] pool-separate -> origin/pool-separate 2025-08-14T21:20:15.4558216Z * [new branch] pr-156087 -> origin/pr-156087 2025-08-14T21:20:15.4560800Z * [new branch] pr/131860 -> origin/pr/131860 2025-08-14T21:20:15.4562645Z * [new branch] predispatch_to -> origin/predispatch_to 2025-08-14T21:20:15.4565251Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-08-14T21:20:15.4567247Z * [new branch] pt2e-cache-model-device -> origin/pt2e-cache-model-device 2025-08-14T21:20:15.4569313Z * [new branch] pull-latest-theme -> origin/pull-latest-theme 2025-08-14T21:20:15.4571315Z * [new branch] pyobjectslot -> origin/pyobjectslot 2025-08-14T21:20:15.4573564Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-08-14T21:20:15.4576676Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-08-14T21:20:15.4578617Z * [new branch] quint-bits -> origin/quint-bits 2025-08-14T21:20:15.4581283Z * [new branch] release/1.10 -> origin/release/1.10 2025-08-14T21:20:15.4583112Z * [new branch] release/1.11 -> origin/release/1.11 2025-08-14T21:20:15.4585175Z * [new branch] release/1.12 -> origin/release/1.12 2025-08-14T21:20:15.4587118Z * [new branch] release/1.13 -> origin/release/1.13 2025-08-14T21:20:15.4589019Z * [new branch] release/1.4 -> origin/release/1.4 2025-08-14T21:20:15.4590436Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-08-14T21:20:15.4592264Z * [new branch] release/1.5 -> origin/release/1.5 2025-08-14T21:20:15.4594196Z * [new branch] release/1.6 -> origin/release/1.6 2025-08-14T21:20:15.4595986Z * [new branch] release/1.7 -> origin/release/1.7 2025-08-14T21:20:15.4597987Z * [new branch] release/1.8 -> origin/release/1.8 2025-08-14T21:20:15.4599719Z * [new branch] release/1.9 -> origin/release/1.9 2025-08-14T21:20:15.4601597Z * [new branch] release/2.0 -> origin/release/2.0 2025-08-14T21:20:15.4603504Z * [new branch] release/2.1 -> origin/release/2.1 2025-08-14T21:20:15.4605604Z * [new branch] release/2.2 -> origin/release/2.2 2025-08-14T21:20:15.4607851Z * [new branch] release/2.3 -> origin/release/2.3 2025-08-14T21:20:15.4610645Z * [new branch] release/2.4 -> origin/release/2.4 2025-08-14T21:20:15.4613120Z * [new branch] release/2.5 -> origin/release/2.5 2025-08-14T21:20:15.4615072Z * [new branch] release/2.6 -> origin/release/2.6 2025-08-14T21:20:15.4617044Z * [new branch] release/2.7 -> origin/release/2.7 2025-08-14T21:20:15.4618970Z * [new branch] release/2.8 -> origin/release/2.8 2025-08-14T21:20:15.4620901Z * [new branch] release_notes -> origin/release_notes 2025-08-14T21:20:15.4622849Z * [new branch] remove-actionable-label -> origin/remove-actionable-label 2025-08-14T21:20:15.4624699Z * [new branch] remove-ao -> origin/remove-ao 2025-08-14T21:20:15.4626839Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-08-14T21:20:15.4628588Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-08-14T21:20:15.4630330Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-08-14T21:20:15.4632710Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-08-14T21:20:15.4634583Z * [new branch] replace-pytorch-labs-20250812-204125 -> origin/replace-pytorch-labs-20250812-204125 2025-08-14T21:20:15.4636494Z * [new branch] replace-pytorch-labs-20250812-205624 -> origin/replace-pytorch-labs-20250812-205624 2025-08-14T21:20:15.4640336Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-08-14T21:20:15.4644038Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-08-14T21:20:15.4648110Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-08-14T21:20:15.4650409Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-08-14T21:20:15.4652132Z * [new branch] revert-direct-updates -> origin/revert-direct-updates 2025-08-14T21:20:15.4654060Z * [new branch] rocm-monitoring -> origin/rocm-monitoring 2025-08-14T21:20:15.4657055Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-08-14T21:20:15.4658625Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-08-14T21:20:15.4661115Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-08-14T21:20:15.4662937Z * [new branch] rzou/njt -> origin/rzou/njt 2025-08-14T21:20:15.4664583Z * [new branch] rzou/operator -> origin/rzou/operator 2025-08-14T21:20:15.4666337Z * [new branch] rzou/pca -> origin/rzou/pca 2025-08-14T21:20:15.4668119Z * [new branch] rzou/pipe_split -> origin/rzou/pipe_split 2025-08-14T21:20:15.4669838Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-08-14T21:20:15.4672071Z * [new branch] rzou/setup_context -> origin/rzou/setup_context 2025-08-14T21:20:15.4674827Z * [new branch] sanchitintel/refactor_aten_int8_woq_gemm -> origin/sanchitintel/refactor_aten_int8_woq_gemm 2025-08-14T21:20:15.4676787Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-08-14T21:20:15.4678622Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-08-14T21:20:15.4680390Z * [new branch] save -> origin/save 2025-08-14T21:20:15.4683011Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-08-14T21:20:15.4685231Z * [new branch] seemethere-patch-1 -> origin/seemethere-patch-1 2025-08-14T21:20:15.4687103Z * [new branch] setup-torchci -> origin/setup-torchci 2025-08-14T21:20:15.4688982Z * [new branch] setupvllm -> origin/setupvllm 2025-08-14T21:20:15.4690855Z * [new branch] share_and_pin_fork -> origin/share_and_pin_fork 2025-08-14T21:20:15.4693367Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-08-14T21:20:15.4695496Z * [new branch] shikaili_fp8_allgather -> origin/shikaili_fp8_allgather 2025-08-14T21:20:15.4697296Z * [new branch] shoumikhin-patch-12 -> origin/shoumikhin-patch-12 2025-08-14T21:20:15.4699227Z * [new branch] simplify-fq-per-channel -> origin/simplify-fq-per-channel 2025-08-14T21:20:15.4701125Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-08-14T21:20:15.4703798Z * [new branch] sqzhang/flight4 -> origin/sqzhang/flight4 2025-08-14T21:20:15.4705582Z * [new branch] sqzhang/flight4plus -> origin/sqzhang/flight4plus 2025-08-14T21:20:15.4708058Z * [new branch] sraikund/record_funct_test -> origin/sraikund/record_funct_test 2025-08-14T21:20:15.4710731Z * [new branch] sraikund16/test -> origin/sraikund16/test 2025-08-14T21:20:15.4712697Z * [new branch] stablize-compilation-time -> origin/stablize-compilation-time 2025-08-14T21:20:15.4714585Z * [new branch] standalone-templates -> origin/standalone-templates 2025-08-14T21:20:15.4716594Z * [new branch] standalone_package_weights -> origin/standalone_package_weights 2025-08-14T21:20:15.4718434Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-08-14T21:20:15.4720348Z * [new branch] step2vllmsetup -> origin/step2vllmsetup 2025-08-14T21:20:15.4722233Z * [new branch] subgraph_fuse -> origin/subgraph_fuse 2025-08-14T21:20:15.4724295Z * [new branch] support-uv-in-collect_env -> origin/support-uv-in-collect_env 2025-08-14T21:20:15.4727477Z * [new branch] suryasub/fix-nccl-hang -> origin/suryasub/fix-nccl-hang 2025-08-14T21:20:15.4729267Z * [new branch] sve-poc -> origin/sve-poc 2025-08-14T21:20:15.4731167Z * [new branch] svekars-patch-1 -> origin/svekars-patch-1 2025-08-14T21:20:15.4733027Z * [new branch] svekars-patch-2 -> origin/svekars-patch-2 2025-08-14T21:20:15.4735030Z * [new branch] switch-bn -> origin/switch-bn 2025-08-14T21:20:15.4736980Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-08-14T21:20:15.4739538Z * [new branch] tenpercent/ck_inductor_gfx950 -> origin/tenpercent/ck_inductor_gfx950 2025-08-14T21:20:15.4741512Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-08-14T21:20:15.4743965Z * [new branch] test-half-migration-internally -> origin/test-half-migration-internally 2025-08-14T21:20:15.4745838Z * [new branch] test-internal-et -> origin/test-internal-et 2025-08-14T21:20:15.4747992Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-08-14T21:20:15.4749975Z * [new branch] test-myst-markdown-docstring -> origin/test-myst-markdown-docstring 2025-08-14T21:20:15.4751785Z * [new branch] test-old -> origin/test-old 2025-08-14T21:20:15.4753718Z * [new branch] test-vec-migration-internally -> origin/test-vec-migration-internally 2025-08-14T21:20:15.4756186Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-08-14T21:20:15.4757900Z * [new branch] test/inductor -> origin/test/inductor 2025-08-14T21:20:15.4759794Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-08-14T21:20:15.4761692Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-08-14T21:20:15.4763659Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-08-14T21:20:15.4765638Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-08-14T21:20:15.4767577Z * [new branch] trackMonitor -> origin/trackMonitor 2025-08-14T21:20:15.4769510Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-08-14T21:20:15.4771421Z * [new branch] tree_vec_base -> origin/tree_vec_base 2025-08-14T21:20:15.4773361Z * [new branch] triton-update -> origin/triton-update 2025-08-14T21:20:15.4775521Z * [new branch] triton_kernel -> origin/triton_kernel 2025-08-14T21:20:15.4777301Z * [new branch] triton_kernel_perf -> origin/triton_kernel_perf 2025-08-14T21:20:15.4779180Z * [new branch] try-runllm -> origin/try-runllm 2025-08-14T21:20:15.4781092Z * [new branch] type_dec -> origin/type_dec 2025-08-14T21:20:15.4783057Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-08-14T21:20:15.4785607Z * [new branch] update-audio-commit-hash/16307312222-1661-1 -> origin/update-audio-commit-hash/16307312222-1661-1 2025-08-14T21:20:15.4787325Z * [new branch] update-audio-commit-hash/16431348808-1673-1 -> origin/update-audio-commit-hash/16431348808-1673-1 2025-08-14T21:20:15.4788966Z * [new branch] update-audio-commit-hash/16510774365-1683-1 -> origin/update-audio-commit-hash/16510774365-1683-1 2025-08-14T21:20:15.4790691Z * [new branch] update-audio-commit-hash/16583472358-1693-1 -> origin/update-audio-commit-hash/16583472358-1693-1 2025-08-14T21:20:15.4792389Z * [new branch] update-audio-commit-hash/16663082088-1700-1 -> origin/update-audio-commit-hash/16663082088-1700-1 2025-08-14T21:20:15.4794390Z * [new branch] update-audio-commit-hash/16737365217-1704-1 -> origin/update-audio-commit-hash/16737365217-1704-1 2025-08-14T21:20:15.4796579Z * [new branch] update-audio-commit-hash/16791960928-1711-1 -> origin/update-audio-commit-hash/16791960928-1711-1 2025-08-14T21:20:15.4798805Z * [new branch] update-audio-commit-hash/16818882925-1712-1 -> origin/update-audio-commit-hash/16818882925-1712-1 2025-08-14T21:20:15.4800756Z * [new branch] update-audio-commit-hash/16895560422-1720-1 -> origin/update-audio-commit-hash/16895560422-1720-1 2025-08-14T21:20:15.4802444Z * [new branch] update-audio-commit-hash/16924174496-1738-1 -> origin/update-audio-commit-hash/16924174496-1738-1 2025-08-14T21:20:15.4804171Z * [new branch] update-dynamic-shapes-doc -> origin/update-dynamic-shapes-doc 2025-08-14T21:20:15.4807048Z * [new branch] update-executorch-commit-hash/15694981040-1626-1 -> origin/update-executorch-commit-hash/15694981040-1626-1 2025-08-14T21:20:15.4809439Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-08-14T21:20:15.4811830Z * [new branch] update-vision-commit-hash/15336342773-1607-1 -> origin/update-vision-commit-hash/15336342773-1607-1 2025-08-14T21:20:15.4814235Z * [new branch] update-vllm-commit-hash/16431348808-1673-1 -> origin/update-vllm-commit-hash/16431348808-1673-1 2025-08-14T21:20:15.4815971Z * [new branch] update-vllm-commit-hash/16484773233-1682-1 -> origin/update-vllm-commit-hash/16484773233-1682-1 2025-08-14T21:20:15.4817658Z * [new branch] update-vllm-commit-hash/16510774365-1683-1 -> origin/update-vllm-commit-hash/16510774365-1683-1 2025-08-14T21:20:15.4819396Z * [new branch] update-vllm-commit-hash/16534031105-1684-1 -> origin/update-vllm-commit-hash/16534031105-1684-1 2025-08-14T21:20:15.4821180Z * [new branch] update-vllm-commit-hash/16545403308-1687-1 -> origin/update-vllm-commit-hash/16545403308-1687-1 2025-08-14T21:20:15.4823224Z * [new branch] update-vllm-commit-hash/16557202787-1688-1 -> origin/update-vllm-commit-hash/16557202787-1688-1 2025-08-14T21:20:15.4825348Z * [new branch] update-vllm-commit-hash/16583472358-1693-1 -> origin/update-vllm-commit-hash/16583472358-1693-1 2025-08-14T21:20:15.4827444Z * [new branch] update-vllm-commit-hash/16663082088-1700-1 -> origin/update-vllm-commit-hash/16663082088-1700-1 2025-08-14T21:20:15.4829333Z * [new branch] update-vllm-commit-hash/16737365217-1704-1 -> origin/update-vllm-commit-hash/16737365217-1704-1 2025-08-14T21:20:15.4831071Z * [new branch] update-vllm-commit-hash/16843157111-1713-1 -> origin/update-vllm-commit-hash/16843157111-1713-1 2025-08-14T21:20:15.4832839Z * [new branch] update-vllm-commit-hash/16855312394-1714-1 -> origin/update-vllm-commit-hash/16855312394-1714-1 2025-08-14T21:20:15.4834601Z * [new branch] update-vllm-commit-hash/16924174496-1738-1 -> origin/update-vllm-commit-hash/16924174496-1738-1 2025-08-14T21:20:15.4836403Z * [new branch] update-vllm-commit-hash/16952608705-1745-1 -> origin/update-vllm-commit-hash/16952608705-1745-1 2025-08-14T21:20:15.4838872Z * [new branch] update-xla-commit-hash/16260974441-194-1 -> origin/update-xla-commit-hash/16260974441-194-1 2025-08-14T21:20:15.4840594Z * [new branch] update-xla-commit-hash/16717126778-197-1 -> origin/update-xla-commit-hash/16717126778-197-1 2025-08-14T21:20:15.4842279Z * [new branch] update-xla-commit-hash/16873912760-198-1 -> origin/update-xla-commit-hash/16873912760-198-1 2025-08-14T21:20:15.4844263Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-08-14T21:20:15.4846176Z * [new branch] update_executorch_pin -> origin/update_executorch_pin 2025-08-14T21:20:15.4848398Z * [new branch] update_slow_tests_1722488736 -> origin/update_slow_tests_1722488736 2025-08-14T21:20:15.4850955Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-08-14T21:20:15.4852854Z * [new branch] update_slow_tests_1752478971 -> origin/update_slow_tests_1752478971 2025-08-14T21:20:15.4855175Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-08-14T21:20:15.4856866Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-08-14T21:20:15.4858751Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-08-14T21:20:15.4860690Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-08-14T21:20:15.4862777Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-08-14T21:20:15.4864891Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-08-14T21:20:15.4866932Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-08-14T21:20:15.4868951Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-08-14T21:20:15.4870928Z * [new branch] v1.3.0 -> origin/v1.3.0 2025-08-14T21:20:15.4872954Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-08-14T21:20:15.4874982Z * [new branch] validate_fn -> origin/validate_fn 2025-08-14T21:20:15.4877169Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-08-14T21:20:15.4879140Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-08-14T21:20:15.4881914Z * [new branch] viable/strict -> origin/viable/strict 2025-08-14T21:20:15.4883850Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-08-14T21:20:15.4885953Z * [new branch] vllmpin -> origin/vllmpin 2025-08-14T21:20:15.4887908Z * [new branch] vllmpintest -> origin/vllmpintest 2025-08-14T21:20:15.4889932Z * [new branch] wdvr-patch-1 -> origin/wdvr-patch-1 2025-08-14T21:20:15.4891969Z * [new branch] wdvr-patch-2 -> origin/wdvr-patch-2 2025-08-14T21:20:15.4894533Z * [new branch] wdvr/conda_devcontainer -> origin/wdvr/conda_devcontainer 2025-08-14T21:20:15.4896327Z * [new branch] wdvr/fix_logging_test -> origin/wdvr/fix_logging_test 2025-08-14T21:20:15.4897981Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-08-14T21:20:15.4899937Z * [new branch] weight_sharing_cpp -> origin/weight_sharing_cpp 2025-08-14T21:20:15.4902522Z * [new branch] whc/flight -> origin/whc/flight 2025-08-14T21:20:15.4904500Z * [new branch] whc/flight4 -> origin/whc/flight4 2025-08-14T21:20:15.4906303Z * [new branch] whc/flight51 -> origin/whc/flight51 2025-08-14T21:20:15.4908100Z * [new branch] whc/flight53 -> origin/whc/flight53 2025-08-14T21:20:15.4909882Z * [new branch] whc/p2phang -> origin/whc/p2phang 2025-08-14T21:20:15.4911757Z * [new branch] whc/stage2 -> origin/whc/stage2 2025-08-14T21:20:15.4913444Z * [new branch] whc/uneven -> origin/whc/uneven 2025-08-14T21:20:15.4915583Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-08-14T21:20:15.4917451Z * [new branch] win_warnings -> origin/win_warnings 2025-08-14T21:20:15.4919265Z * [new branch] workonoldcommit -> origin/workonoldcommit 2025-08-14T21:20:15.4921924Z * [new branch] wwen/programming-model-2.8 -> origin/wwen/programming-model-2.8 2025-08-14T21:20:15.4924417Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-08-14T21:20:15.4926241Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-08-14T21:20:15.4928059Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-08-14T21:20:15.4929673Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-08-14T21:20:15.4931311Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-08-14T21:20:15.4933106Z * [new branch] xmfan/ca_api -> origin/xmfan/ca_api 2025-08-14T21:20:15.4934821Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-08-14T21:20:15.4936948Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-08-14T21:20:15.4939152Z * [new branch] xmfan/ca_cudagraphs -> origin/xmfan/ca_cudagraphs 2025-08-14T21:20:15.4941033Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-08-14T21:20:15.4942857Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-08-14T21:20:15.4944726Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-08-14T21:20:15.4946530Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 2025-08-14T21:20:15.4950661Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-08-14T21:20:15.4952635Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-08-14T21:20:15.4954541Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-08-14T21:20:15.4956342Z * [new branch] xmfan/ca_mem_base -> origin/xmfan/ca_mem_base 2025-08-14T21:20:15.4958164Z * [new branch] xmfan/ca_mem_fix -> origin/xmfan/ca_mem_fix 2025-08-14T21:20:15.4959980Z * [new branch] xmfan/ca_memory_fix -> origin/xmfan/ca_memory_fix 2025-08-14T21:20:15.4961709Z * [new branch] xmfan/ca_memory_fix_rebased -> origin/xmfan/ca_memory_fix_rebased 2025-08-14T21:20:15.4963519Z * [new branch] xmfan/ca_memory_fix_rebased2 -> origin/xmfan/ca_memory_fix_rebased2 2025-08-14T21:20:15.4965499Z * [new branch] xmfan/ca_move_to_cuda -> origin/xmfan/ca_move_to_cuda 2025-08-14T21:20:15.4967315Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-08-14T21:20:15.4969252Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-08-14T21:20:15.4971102Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 2025-08-14T21:20:15.4972910Z * [new branch] xmfan/ca_scalar -> origin/xmfan/ca_scalar 2025-08-14T21:20:15.4974777Z * [new branch] xmfan/ca_subclass_mem_fix -> origin/xmfan/ca_subclass_mem_fix 2025-08-14T21:20:15.4976564Z * [new branch] xmfan/ca_warm_mem -> origin/xmfan/ca_warm_mem 2025-08-14T21:20:15.4978374Z * [new branch] xmfan/ca_warm_mem_base -> origin/xmfan/ca_warm_mem_base 2025-08-14T21:20:15.4980338Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-08-14T21:20:15.4982172Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-08-14T21:20:15.4984019Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-08-14T21:20:15.4985870Z * [new branch] xmfan/cacu_may27 -> origin/xmfan/cacu_may27 2025-08-14T21:20:15.4987743Z * [new branch] xmfan/circular_dep -> origin/xmfan/circular_dep 2025-08-14T21:20:15.4989613Z * [new branch] xmfan/compiled_autograd_feb_29 -> origin/xmfan/compiled_autograd_feb_29 2025-08-14T21:20:15.4991600Z * [new branch] xmfan/compiled_autograd_graph_breaks -> origin/xmfan/compiled_autograd_graph_breaks 2025-08-14T21:20:15.4993302Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-08-14T21:20:15.4995178Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-08-14T21:20:15.4997087Z * [new branch] xmfan/issue_123374 -> origin/xmfan/issue_123374 2025-08-14T21:20:15.4999635Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-08-14T21:20:15.5001452Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-08-14T21:20:15.5003097Z * [new branch] xmfan/segfault_test -> origin/xmfan/segfault_test 2025-08-14T21:20:15.5004988Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-08-14T21:20:15.5006834Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-08-14T21:20:15.5008771Z * [new branch] xmfan/test -> origin/xmfan/test 2025-08-14T21:20:15.5010745Z * [new branch] y-do-we-have-7-build-systems -> origin/y-do-we-have-7-build-systems 2025-08-14T21:20:15.5013322Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-08-14T21:20:15.5015001Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-08-14T21:20:15.5016771Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-08-14T21:20:15.5018632Z * [new branch] yihan_quantization -> origin/yihan_quantization 2025-08-14T21:20:15.5021174Z * [new branch] yiming/add_nativert_benchmark -> origin/yiming/add_nativert_benchmark 2025-08-14T21:20:15.5022857Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 2025-08-14T21:20:15.5025463Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-08-14T21:20:15.5027405Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-08-14T21:20:15.5029192Z * [new branch] zainr/fixlint -> origin/zainr/fixlint 2025-08-14T21:20:15.5030898Z * [new branch] zainr/git-push-v2 -> origin/zainr/git-push-v2 2025-08-14T21:20:15.5032579Z * [new branch] zainr/lint-py3.9 -> origin/zainr/lint-py3.9 2025-08-14T21:20:15.5034294Z * [new branch] zainr/mypy15-claude -> origin/zainr/mypy15-claude 2025-08-14T21:20:15.5036017Z * [new branch] zainr/pre-push-hooks -> origin/zainr/pre-push-hooks 2025-08-14T21:20:15.5038028Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-08-14T21:20:15.5040045Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-08-14T21:20:15.5042317Z * [new branch] zainr/unstable -> origin/zainr/unstable 2025-08-14T21:20:15.5044801Z * [new branch] zainr/unstable-xla -> origin/zainr/unstable-xla 2025-08-14T21:20:15.5046724Z * [new branch] zainr/uv-pip-fix -> origin/zainr/uv-pip-fix 2025-08-14T21:20:15.5049053Z * [new branch] zainr/vs-aarch64 -> origin/zainr/vs-aarch64 2025-08-14T21:20:15.5051205Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-08-14T21:20:15.5053149Z * [new branch] zb2p -> origin/zb2p 2025-08-14T21:20:15.5055163Z * [new branch] zdevito-patch-1 -> origin/zdevito-patch-1 2025-08-14T21:20:15.5057204Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-08-14T21:20:15.5060331Z * [new branch] zhxchen17/nativert/0 -> origin/zhxchen17/nativert/0 2025-08-14T21:20:15.5062771Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-08-14T21:20:15.5065338Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-08-14T21:20:15.5067931Z * [new branch] zxiiro/bazel -> origin/zxiiro/bazel 2025-08-14T21:20:15.5069896Z * [new branch] zxiiro/get-hardware -> origin/zxiiro/get-hardware 2025-08-14T21:20:15.5071674Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-08-14T21:20:15.5073329Z * [new branch] zxiiro/test -> origin/zxiiro/test 2025-08-14T21:20:15.5075067Z * [new tag] bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug -> bc2caa7fdf006894eff7af936babde69ab5a40f8-huydhn-debug 2025-08-14T21:20:15.5076711Z * [new tag] ci/binaries/77164 -> ci/binaries/77164 2025-08-14T21:20:15.5078656Z * [new tag] ciflow/binaries/138996 -> ciflow/binaries/138996 2025-08-14T21:20:15.5079977Z * [new tag] ciflow/binaries/143959 -> ciflow/binaries/143959 2025-08-14T21:20:15.5081315Z * [new tag] ciflow/binaries/154595 -> ciflow/binaries/154595 2025-08-14T21:20:15.5082537Z * [new tag] ciflow/binaries/156049 -> ciflow/binaries/156049 2025-08-14T21:20:15.5083788Z * [new tag] ciflow/binaries/156712 -> ciflow/binaries/156712 2025-08-14T21:20:15.5085093Z * [new tag] ciflow/binaries/157432 -> ciflow/binaries/157432 2025-08-14T21:20:15.5086375Z * [new tag] ciflow/binaries/157685 -> ciflow/binaries/157685 2025-08-14T21:20:15.5087454Z * [new tag] ciflow/binaries/157689 -> ciflow/binaries/157689 2025-08-14T21:20:15.5088860Z * [new tag] ciflow/binaries/158104 -> ciflow/binaries/158104 2025-08-14T21:20:15.5090657Z * [new tag] ciflow/binaries/158623 -> ciflow/binaries/158623 2025-08-14T21:20:15.5091194Z * [new tag] ciflow/binaries/159827 -> ciflow/binaries/159827 2025-08-14T21:20:15.5093060Z * [new tag] ciflow/binaries/159869 -> ciflow/binaries/159869 2025-08-14T21:20:15.5094810Z * [new tag] ciflow/binaries/160593 -> ciflow/binaries/160593 2025-08-14T21:20:15.5096349Z * [new tag] ciflow/binaries_libtorch/143959 -> ciflow/binaries_libtorch/143959 2025-08-14T21:20:15.5097666Z * [new tag] ciflow/binaries_libtorch/156049 -> ciflow/binaries_libtorch/156049 2025-08-14T21:20:15.5098734Z * [new tag] ciflow/binaries_libtorch/157432 -> ciflow/binaries_libtorch/157432 2025-08-14T21:20:15.5100248Z * [new tag] ciflow/binaries_wheel/143959 -> ciflow/binaries_wheel/143959 2025-08-14T21:20:15.5101351Z * [new tag] ciflow/binaries_wheel/156049 -> ciflow/binaries_wheel/156049 2025-08-14T21:20:15.5102665Z * [new tag] ciflow/binaries_wheel/157432 -> ciflow/binaries_wheel/157432 2025-08-14T21:20:15.5103964Z * [new tag] ciflow/binaries_wheel/158733 -> ciflow/binaries_wheel/158733 2025-08-14T21:20:15.5105061Z * [new tag] ciflow/binaries_wheel/160301 -> ciflow/binaries_wheel/160301 2025-08-14T21:20:15.5106469Z * [new tag] ciflow/binaries_wheel/160496 -> ciflow/binaries_wheel/160496 2025-08-14T21:20:15.5108201Z * [new tag] ciflow/h100-distributed/156703 -> ciflow/h100-distributed/156703 2025-08-14T21:20:15.5109739Z * [new tag] ciflow/h100-symm-mem/151845 -> ciflow/h100-symm-mem/151845 2025-08-14T21:20:15.5110966Z * [new tag] ciflow/h100-symm-mem/155923 -> ciflow/h100-symm-mem/155923 2025-08-14T21:20:15.5112165Z * [new tag] ciflow/h100-symm-mem/157635 -> ciflow/h100-symm-mem/157635 2025-08-14T21:20:15.5113524Z * [new tag] ciflow/h100-symm-mem/159118 -> ciflow/h100-symm-mem/159118 2025-08-14T21:20:15.5114924Z * [new tag] ciflow/h100-symm-mem/159562 -> ciflow/h100-symm-mem/159562 2025-08-14T21:20:15.5116220Z * [new tag] ciflow/h100-symm-mem/159889 -> ciflow/h100-symm-mem/159889 2025-08-14T21:20:15.5117646Z * [new tag] ciflow/h100/159158 -> ciflow/h100/159158 2025-08-14T21:20:15.5119273Z * [new tag] ciflow/h100/160450 -> ciflow/h100/160450 2025-08-14T21:20:15.5120529Z * [new tag] ciflow/h100/160480 -> ciflow/h100/160480 2025-08-14T21:20:15.5122057Z * [new tag] ciflow/h100/160614 -> ciflow/h100/160614 2025-08-14T21:20:15.5123741Z * [new tag] ciflow/inductor-perf-test-nightly-rocm/151845 -> ciflow/inductor-perf-test-nightly-rocm/151845 2025-08-14T21:20:15.5125245Z * [new tag] ciflow/inductor-perf-test-nightly-rocm/160538 -> ciflow/inductor-perf-test-nightly-rocm/160538 2025-08-14T21:20:15.5127234Z * [new tag] ciflow/inductor-perf-test-nightly-x86-zen/156599 -> ciflow/inductor-perf-test-nightly-x86-zen/156599 2025-08-14T21:20:15.5128751Z * [new tag] ciflow/inductor-periodic/160406 -> ciflow/inductor-periodic/160406 2025-08-14T21:20:15.5130526Z * [new tag] ciflow/inductor-periodic/160538 -> ciflow/inductor-periodic/160538 2025-08-14T21:20:15.5131987Z * [new tag] ciflow/inductor-rocm/151845 -> ciflow/inductor-rocm/151845 2025-08-14T21:20:15.5133226Z * [new tag] ciflow/inductor-rocm/159158 -> ciflow/inductor-rocm/159158 2025-08-14T21:20:15.5134536Z * [new tag] ciflow/inductor-rocm/160073 -> ciflow/inductor-rocm/160073 2025-08-14T21:20:15.5135643Z * [new tag] ciflow/inductor-rocm/160538 -> ciflow/inductor-rocm/160538 2025-08-14T21:20:15.5137203Z * [new tag] ciflow/inductor/134881 -> ciflow/inductor/134881 2025-08-14T21:20:15.5138426Z * [new tag] ciflow/inductor/137400 -> ciflow/inductor/137400 2025-08-14T21:20:15.5139635Z * [new tag] ciflow/inductor/144516 -> ciflow/inductor/144516 2025-08-14T21:20:15.5140843Z * [new tag] ciflow/inductor/146506 -> ciflow/inductor/146506 2025-08-14T21:20:15.5142049Z * [new tag] ciflow/inductor/147360 -> ciflow/inductor/147360 2025-08-14T21:20:15.5143266Z * [new tag] ciflow/inductor/147990 -> ciflow/inductor/147990 2025-08-14T21:20:15.5144470Z * [new tag] ciflow/inductor/148180 -> ciflow/inductor/148180 2025-08-14T21:20:15.5145650Z * [new tag] ciflow/inductor/148328 -> ciflow/inductor/148328 2025-08-14T21:20:15.5146947Z * [new tag] ciflow/inductor/148484 -> ciflow/inductor/148484 2025-08-14T21:20:15.5148363Z * [new tag] ciflow/inductor/148492 -> ciflow/inductor/148492 2025-08-14T21:20:15.5149545Z * [new tag] ciflow/inductor/150302 -> ciflow/inductor/150302 2025-08-14T21:20:15.5151112Z * [new tag] ciflow/inductor/151845 -> ciflow/inductor/151845 2025-08-14T21:20:15.5152826Z * [new tag] ciflow/inductor/152198 -> ciflow/inductor/152198 2025-08-14T21:20:15.5154371Z * [new tag] ciflow/inductor/152624 -> ciflow/inductor/152624 2025-08-14T21:20:15.5155458Z * [new tag] ciflow/inductor/153966 -> ciflow/inductor/153966 2025-08-14T21:20:15.5156873Z * [new tag] ciflow/inductor/154193 -> ciflow/inductor/154193 2025-08-14T21:20:15.5158338Z * [new tag] ciflow/inductor/154650 -> ciflow/inductor/154650 2025-08-14T21:20:15.5159613Z * [new tag] ciflow/inductor/154694 -> ciflow/inductor/154694 2025-08-14T21:20:15.5160867Z * [new tag] ciflow/inductor/155072 -> ciflow/inductor/155072 2025-08-14T21:20:15.5162139Z * [new tag] ciflow/inductor/155152 -> ciflow/inductor/155152 2025-08-14T21:20:15.5163435Z * [new tag] ciflow/inductor/155153 -> ciflow/inductor/155153 2025-08-14T21:20:15.5164780Z * [new tag] ciflow/inductor/155154 -> ciflow/inductor/155154 2025-08-14T21:20:15.5166130Z * [new tag] ciflow/inductor/155501 -> ciflow/inductor/155501 2025-08-14T21:20:15.5167553Z * [new tag] ciflow/inductor/155502 -> ciflow/inductor/155502 2025-08-14T21:20:15.5168583Z * [new tag] ciflow/inductor/155503 -> ciflow/inductor/155503 2025-08-14T21:20:15.5170235Z * [new tag] ciflow/inductor/155504 -> ciflow/inductor/155504 2025-08-14T21:20:15.5171468Z * [new tag] ciflow/inductor/155557 -> ciflow/inductor/155557 2025-08-14T21:20:15.5172740Z * [new tag] ciflow/inductor/155608 -> ciflow/inductor/155608 2025-08-14T21:20:15.5174018Z * [new tag] ciflow/inductor/155923 -> ciflow/inductor/155923 2025-08-14T21:20:15.5175292Z * [new tag] ciflow/inductor/155928 -> ciflow/inductor/155928 2025-08-14T21:20:15.5176738Z * [new tag] ciflow/inductor/155958 -> ciflow/inductor/155958 2025-08-14T21:20:15.5178520Z * [new tag] ciflow/inductor/156049 -> ciflow/inductor/156049 2025-08-14T21:20:15.5179836Z * [new tag] ciflow/inductor/156851 -> ciflow/inductor/156851 2025-08-14T21:20:15.5181112Z * [new tag] ciflow/inductor/156967 -> ciflow/inductor/156967 2025-08-14T21:20:15.5182450Z * [new tag] ciflow/inductor/157148 -> ciflow/inductor/157148 2025-08-14T21:20:15.5183710Z * [new tag] ciflow/inductor/157149 -> ciflow/inductor/157149 2025-08-14T21:20:15.5185032Z * [new tag] ciflow/inductor/157152 -> ciflow/inductor/157152 2025-08-14T21:20:15.5186308Z * [new tag] ciflow/inductor/157542 -> ciflow/inductor/157542 2025-08-14T21:20:15.5187569Z * [new tag] ciflow/inductor/157572 -> ciflow/inductor/157572 2025-08-14T21:20:15.5188845Z * [new tag] ciflow/inductor/157635 -> ciflow/inductor/157635 2025-08-14T21:20:15.5190405Z * [new tag] ciflow/inductor/157685 -> ciflow/inductor/157685 2025-08-14T21:20:15.5191724Z * [new tag] ciflow/inductor/157686 -> ciflow/inductor/157686 2025-08-14T21:20:15.5192985Z * [new tag] ciflow/inductor/157689 -> ciflow/inductor/157689 2025-08-14T21:20:15.5194255Z * [new tag] ciflow/inductor/157699 -> ciflow/inductor/157699 2025-08-14T21:20:15.5195766Z * [new tag] ciflow/inductor/157743 -> ciflow/inductor/157743 2025-08-14T21:20:15.5197128Z * [new tag] ciflow/inductor/157944 -> ciflow/inductor/157944 2025-08-14T21:20:15.5198411Z * [new tag] ciflow/inductor/157971 -> ciflow/inductor/157971 2025-08-14T21:20:15.5199809Z * [new tag] ciflow/inductor/157994 -> ciflow/inductor/157994 2025-08-14T21:20:15.5201182Z * [new tag] ciflow/inductor/158061 -> ciflow/inductor/158061 2025-08-14T21:20:15.5202486Z * [new tag] ciflow/inductor/158091 -> ciflow/inductor/158091 2025-08-14T21:20:15.5203689Z * [new tag] ciflow/inductor/158097 -> ciflow/inductor/158097 2025-08-14T21:20:15.5205132Z * [new tag] ciflow/inductor/158098 -> ciflow/inductor/158098 2025-08-14T21:20:15.5206520Z * [new tag] ciflow/inductor/158104 -> ciflow/inductor/158104 2025-08-14T21:20:15.5207817Z * [new tag] ciflow/inductor/158168 -> ciflow/inductor/158168 2025-08-14T21:20:15.5209229Z * [new tag] ciflow/inductor/158250 -> ciflow/inductor/158250 2025-08-14T21:20:15.5210523Z * [new tag] ciflow/inductor/158321 -> ciflow/inductor/158321 2025-08-14T21:20:15.5211806Z * [new tag] ciflow/inductor/158609 -> ciflow/inductor/158609 2025-08-14T21:20:15.5213070Z * [new tag] ciflow/inductor/158647 -> ciflow/inductor/158647 2025-08-14T21:20:15.5214347Z * [new tag] ciflow/inductor/158914 -> ciflow/inductor/158914 2025-08-14T21:20:15.5215960Z * [new tag] ciflow/inductor/158932 -> ciflow/inductor/158932 2025-08-14T21:20:15.5217181Z * [new tag] ciflow/inductor/158987 -> ciflow/inductor/158987 2025-08-14T21:20:15.5218421Z * [new tag] ciflow/inductor/159009 -> ciflow/inductor/159009 2025-08-14T21:20:15.5219696Z * [new tag] ciflow/inductor/159010 -> ciflow/inductor/159010 2025-08-14T21:20:15.5221010Z * [new tag] ciflow/inductor/159093 -> ciflow/inductor/159093 2025-08-14T21:20:15.5222292Z * [new tag] ciflow/inductor/159158 -> ciflow/inductor/159158 2025-08-14T21:20:15.5223585Z * [new tag] ciflow/inductor/159197 -> ciflow/inductor/159197 2025-08-14T21:20:15.5225034Z * [new tag] ciflow/inductor/159274 -> ciflow/inductor/159274 2025-08-14T21:20:15.5226327Z * [new tag] ciflow/inductor/159281 -> ciflow/inductor/159281 2025-08-14T21:20:15.5227673Z * [new tag] ciflow/inductor/159329 -> ciflow/inductor/159329 2025-08-14T21:20:15.5228906Z * [new tag] ciflow/inductor/159361 -> ciflow/inductor/159361 2025-08-14T21:20:15.5230189Z * [new tag] ciflow/inductor/159365 -> ciflow/inductor/159365 2025-08-14T21:20:15.5231470Z * [new tag] ciflow/inductor/159366 -> ciflow/inductor/159366 2025-08-14T21:20:15.5232800Z * [new tag] ciflow/inductor/159367 -> ciflow/inductor/159367 2025-08-14T21:20:15.5234055Z * [new tag] ciflow/inductor/159368 -> ciflow/inductor/159368 2025-08-14T21:20:15.5235472Z * [new tag] ciflow/inductor/159473 -> ciflow/inductor/159473 2025-08-14T21:20:15.5236809Z * [new tag] ciflow/inductor/159483 -> ciflow/inductor/159483 2025-08-14T21:20:15.5238210Z * [new tag] ciflow/inductor/159508 -> ciflow/inductor/159508 2025-08-14T21:20:15.5239470Z * [new tag] ciflow/inductor/159523 -> ciflow/inductor/159523 2025-08-14T21:20:15.5240762Z * [new tag] ciflow/inductor/159678 -> ciflow/inductor/159678 2025-08-14T21:20:15.5242048Z * [new tag] ciflow/inductor/159691 -> ciflow/inductor/159691 2025-08-14T21:20:15.5243503Z * [new tag] ciflow/inductor/159778 -> ciflow/inductor/159778 2025-08-14T21:20:15.5244886Z * [new tag] ciflow/inductor/159786 -> ciflow/inductor/159786 2025-08-14T21:20:15.5246248Z * [new tag] ciflow/inductor/159817 -> ciflow/inductor/159817 2025-08-14T21:20:15.5247806Z * [new tag] ciflow/inductor/159842 -> ciflow/inductor/159842 2025-08-14T21:20:15.5249272Z * [new tag] ciflow/inductor/159864 -> ciflow/inductor/159864 2025-08-14T21:20:15.5250558Z * [new tag] ciflow/inductor/159865 -> ciflow/inductor/159865 2025-08-14T21:20:15.5251824Z * [new tag] ciflow/inductor/159869 -> ciflow/inductor/159869 2025-08-14T21:20:15.5253022Z * [new tag] ciflow/inductor/159875 -> ciflow/inductor/159875 2025-08-14T21:20:15.5254378Z * [new tag] ciflow/inductor/159889 -> ciflow/inductor/159889 2025-08-14T21:20:15.5255671Z * [new tag] ciflow/inductor/159902 -> ciflow/inductor/159902 2025-08-14T21:20:15.5257093Z * [new tag] ciflow/inductor/159923 -> ciflow/inductor/159923 2025-08-14T21:20:15.5258507Z * [new tag] ciflow/inductor/159944 -> ciflow/inductor/159944 2025-08-14T21:20:15.5259875Z * [new tag] ciflow/inductor/160004 -> ciflow/inductor/160004 2025-08-14T21:20:15.5261173Z * [new tag] ciflow/inductor/160080 -> ciflow/inductor/160080 2025-08-14T21:20:15.5262476Z * [new tag] ciflow/inductor/160108 -> ciflow/inductor/160108 2025-08-14T21:20:15.5264276Z * [new tag] ciflow/inductor/160109 -> ciflow/inductor/160109 2025-08-14T21:20:15.5265976Z * [new tag] ciflow/inductor/160111 -> ciflow/inductor/160111 2025-08-14T21:20:15.5266922Z * [new tag] ciflow/inductor/160113 -> ciflow/inductor/160113 2025-08-14T21:20:15.5268376Z * [new tag] ciflow/inductor/160127 -> ciflow/inductor/160127 2025-08-14T21:20:15.5269664Z * [new tag] ciflow/inductor/160131 -> ciflow/inductor/160131 2025-08-14T21:20:15.5270971Z * [new tag] ciflow/inductor/160132 -> ciflow/inductor/160132 2025-08-14T21:20:15.5272232Z * [new tag] ciflow/inductor/160136 -> ciflow/inductor/160136 2025-08-14T21:20:15.5273500Z * [new tag] ciflow/inductor/160138 -> ciflow/inductor/160138 2025-08-14T21:20:15.5274837Z * [new tag] ciflow/inductor/160151 -> ciflow/inductor/160151 2025-08-14T21:20:15.5276119Z * [new tag] ciflow/inductor/160152 -> ciflow/inductor/160152 2025-08-14T21:20:15.5277453Z * [new tag] ciflow/inductor/160154 -> ciflow/inductor/160154 2025-08-14T21:20:15.5278744Z * [new tag] ciflow/inductor/160156 -> ciflow/inductor/160156 2025-08-14T21:20:15.5280166Z * [new tag] ciflow/inductor/160161 -> ciflow/inductor/160161 2025-08-14T21:20:15.5281706Z * [new tag] ciflow/inductor/160166 -> ciflow/inductor/160166 2025-08-14T21:20:15.5283036Z * [new tag] ciflow/inductor/160168 -> ciflow/inductor/160168 2025-08-14T21:20:15.5284319Z * [new tag] ciflow/inductor/160174 -> ciflow/inductor/160174 2025-08-14T21:20:15.5285747Z * [new tag] ciflow/inductor/160181 -> ciflow/inductor/160181 2025-08-14T21:20:15.5287170Z * [new tag] ciflow/inductor/160183 -> ciflow/inductor/160183 2025-08-14T21:20:15.5288674Z * [new tag] ciflow/inductor/160190 -> ciflow/inductor/160190 2025-08-14T21:20:15.5290198Z * [new tag] ciflow/inductor/160198 -> ciflow/inductor/160198 2025-08-14T21:20:15.5291555Z * [new tag] ciflow/inductor/160201 -> ciflow/inductor/160201 2025-08-14T21:20:15.5292862Z * [new tag] ciflow/inductor/160209 -> ciflow/inductor/160209 2025-08-14T21:20:15.5294354Z * [new tag] ciflow/inductor/160218 -> ciflow/inductor/160218 2025-08-14T21:20:15.5295623Z * [new tag] ciflow/inductor/160239 -> ciflow/inductor/160239 2025-08-14T21:20:15.5297001Z * [new tag] ciflow/inductor/160250 -> ciflow/inductor/160250 2025-08-14T21:20:15.5298352Z * [new tag] ciflow/inductor/160253 -> ciflow/inductor/160253 2025-08-14T21:20:15.5299609Z * [new tag] ciflow/inductor/160266 -> ciflow/inductor/160266 2025-08-14T21:20:15.5300892Z * [new tag] ciflow/inductor/160282 -> ciflow/inductor/160282 2025-08-14T21:20:15.5302255Z * [new tag] ciflow/inductor/160298 -> ciflow/inductor/160298 2025-08-14T21:20:15.5303482Z * [new tag] ciflow/inductor/160301 -> ciflow/inductor/160301 2025-08-14T21:20:15.5304869Z * [new tag] ciflow/inductor/160310 -> ciflow/inductor/160310 2025-08-14T21:20:15.5306312Z * [new tag] ciflow/inductor/160323 -> ciflow/inductor/160323 2025-08-14T21:20:15.5307958Z * [new tag] ciflow/inductor/160324 -> ciflow/inductor/160324 2025-08-14T21:20:15.5309487Z * [new tag] ciflow/inductor/160325 -> ciflow/inductor/160325 2025-08-14T21:20:15.5310947Z * [new tag] ciflow/inductor/160326 -> ciflow/inductor/160326 2025-08-14T21:20:15.5312343Z * [new tag] ciflow/inductor/160327 -> ciflow/inductor/160327 2025-08-14T21:20:15.5313756Z * [new tag] ciflow/inductor/160328 -> ciflow/inductor/160328 2025-08-14T21:20:15.5322195Z * [new tag] ciflow/inductor/160329 -> ciflow/inductor/160329 2025-08-14T21:20:15.5322523Z * [new tag] ciflow/inductor/160351 -> ciflow/inductor/160351 2025-08-14T21:20:15.5322718Z * [new tag] ciflow/inductor/160353 -> ciflow/inductor/160353 2025-08-14T21:20:15.5322898Z * [new tag] ciflow/inductor/160362 -> ciflow/inductor/160362 2025-08-14T21:20:15.5323072Z * [new tag] ciflow/inductor/160363 -> ciflow/inductor/160363 2025-08-14T21:20:15.5323241Z * [new tag] ciflow/inductor/160364 -> ciflow/inductor/160364 2025-08-14T21:20:15.5323423Z * [new tag] ciflow/inductor/160365 -> ciflow/inductor/160365 2025-08-14T21:20:15.5324744Z * [new tag] ciflow/inductor/160366 -> ciflow/inductor/160366 2025-08-14T21:20:15.5326238Z * [new tag] ciflow/inductor/160367 -> ciflow/inductor/160367 2025-08-14T21:20:15.5327512Z * [new tag] ciflow/inductor/160368 -> ciflow/inductor/160368 2025-08-14T21:20:15.5328838Z * [new tag] ciflow/inductor/160369 -> ciflow/inductor/160369 2025-08-14T21:20:15.5330117Z * [new tag] ciflow/inductor/160371 -> ciflow/inductor/160371 2025-08-14T21:20:15.5331369Z * [new tag] ciflow/inductor/160374 -> ciflow/inductor/160374 2025-08-14T21:20:15.5332655Z * [new tag] ciflow/inductor/160375 -> ciflow/inductor/160375 2025-08-14T21:20:15.5333949Z * [new tag] ciflow/inductor/160377 -> ciflow/inductor/160377 2025-08-14T21:20:15.5335266Z * [new tag] ciflow/inductor/160380 -> ciflow/inductor/160380 2025-08-14T21:20:15.5336821Z * [new tag] ciflow/inductor/160381 -> ciflow/inductor/160381 2025-08-14T21:20:15.5338368Z * [new tag] ciflow/inductor/160383 -> ciflow/inductor/160383 2025-08-14T21:20:15.5339875Z * [new tag] ciflow/inductor/160394 -> ciflow/inductor/160394 2025-08-14T21:20:15.5341181Z * [new tag] ciflow/inductor/160401 -> ciflow/inductor/160401 2025-08-14T21:20:15.5342480Z * [new tag] ciflow/inductor/160402 -> ciflow/inductor/160402 2025-08-14T21:20:15.5343761Z * [new tag] ciflow/inductor/160403 -> ciflow/inductor/160403 2025-08-14T21:20:15.5345005Z * [new tag] ciflow/inductor/160424 -> ciflow/inductor/160424 2025-08-14T21:20:15.5346302Z * [new tag] ciflow/inductor/160426 -> ciflow/inductor/160426 2025-08-14T21:20:15.5347996Z * [new tag] ciflow/inductor/160431 -> ciflow/inductor/160431 2025-08-14T21:20:15.5349288Z * [new tag] ciflow/inductor/160448 -> ciflow/inductor/160448 2025-08-14T21:20:15.5350657Z * [new tag] ciflow/inductor/160450 -> ciflow/inductor/160450 2025-08-14T21:20:15.5352502Z * [new tag] ciflow/inductor/160455 -> ciflow/inductor/160455 2025-08-14T21:20:15.5354054Z * [new tag] ciflow/inductor/160456 -> ciflow/inductor/160456 2025-08-14T21:20:15.5355475Z * [new tag] ciflow/inductor/160461 -> ciflow/inductor/160461 2025-08-14T21:20:15.5356773Z * [new tag] ciflow/inductor/160462 -> ciflow/inductor/160462 2025-08-14T21:20:15.5358016Z * [new tag] ciflow/inductor/160467 -> ciflow/inductor/160467 2025-08-14T21:20:15.5359339Z * [new tag] ciflow/inductor/160470 -> ciflow/inductor/160470 2025-08-14T21:20:15.5360684Z * [new tag] ciflow/inductor/160473 -> ciflow/inductor/160473 2025-08-14T21:20:15.5362018Z * [new tag] ciflow/inductor/160476 -> ciflow/inductor/160476 2025-08-14T21:20:15.5363260Z * [new tag] ciflow/inductor/160480 -> ciflow/inductor/160480 2025-08-14T21:20:15.5364737Z * [new tag] ciflow/inductor/160481 -> ciflow/inductor/160481 2025-08-14T21:20:15.5366404Z * [new tag] ciflow/inductor/160482 -> ciflow/inductor/160482 2025-08-14T21:20:15.5367303Z * [new tag] ciflow/inductor/160483 -> ciflow/inductor/160483 2025-08-14T21:20:15.5368798Z * [new tag] ciflow/inductor/160485 -> ciflow/inductor/160485 2025-08-14T21:20:15.5370366Z * [new tag] ciflow/inductor/160486 -> ciflow/inductor/160486 2025-08-14T21:20:15.5371658Z * [new tag] ciflow/inductor/160503 -> ciflow/inductor/160503 2025-08-14T21:20:15.5372981Z * [new tag] ciflow/inductor/160510 -> ciflow/inductor/160510 2025-08-14T21:20:15.5374232Z * [new tag] ciflow/inductor/160527 -> ciflow/inductor/160527 2025-08-14T21:20:15.5375542Z * [new tag] ciflow/inductor/160530 -> ciflow/inductor/160530 2025-08-14T21:20:15.5376915Z * [new tag] ciflow/inductor/160531 -> ciflow/inductor/160531 2025-08-14T21:20:15.5378248Z * [new tag] ciflow/inductor/160538 -> ciflow/inductor/160538 2025-08-14T21:20:15.5380164Z * [new tag] ciflow/inductor/160539 -> ciflow/inductor/160539 2025-08-14T21:20:15.5381769Z * [new tag] ciflow/inductor/160540 -> ciflow/inductor/160540 2025-08-14T21:20:15.5383087Z * [new tag] ciflow/inductor/160548 -> ciflow/inductor/160548 2025-08-14T21:20:15.5384432Z * [new tag] ciflow/inductor/160561 -> ciflow/inductor/160561 2025-08-14T21:20:15.5385878Z * [new tag] ciflow/inductor/160576 -> ciflow/inductor/160576 2025-08-14T21:20:15.5387137Z * [new tag] ciflow/inductor/160578 -> ciflow/inductor/160578 2025-08-14T21:20:15.5388463Z * [new tag] ciflow/inductor/160580 -> ciflow/inductor/160580 2025-08-14T21:20:15.5389783Z * [new tag] ciflow/inductor/160583 -> ciflow/inductor/160583 2025-08-14T21:20:15.5391282Z * [new tag] ciflow/inductor/160589 -> ciflow/inductor/160589 2025-08-14T21:20:15.5392681Z * [new tag] ciflow/inductor/160590 -> ciflow/inductor/160590 2025-08-14T21:20:15.5394198Z * [new tag] ciflow/inductor/160592 -> ciflow/inductor/160592 2025-08-14T21:20:15.5395566Z * [new tag] ciflow/inductor/160596 -> ciflow/inductor/160596 2025-08-14T21:20:15.5396817Z * [new tag] ciflow/inductor/160601 -> ciflow/inductor/160601 2025-08-14T21:20:15.5398271Z * [new tag] ciflow/inductor/160607 -> ciflow/inductor/160607 2025-08-14T21:20:15.5399735Z * [new tag] ciflow/inductor/160608 -> ciflow/inductor/160608 2025-08-14T21:20:15.5401041Z * [new tag] ciflow/inductor/160611 -> ciflow/inductor/160611 2025-08-14T21:20:15.5402333Z * [new tag] ciflow/inductor/160614 -> ciflow/inductor/160614 2025-08-14T21:20:15.5403683Z * [new tag] ciflow/inductor/160616 -> ciflow/inductor/160616 2025-08-14T21:20:15.5405066Z * [new tag] ciflow/inductor/160619 -> ciflow/inductor/160619 2025-08-14T21:20:15.5406452Z * [new tag] ciflow/inductor/160625 -> ciflow/inductor/160625 2025-08-14T21:20:15.5407819Z * [new tag] ciflow/inductor/160635 -> ciflow/inductor/160635 2025-08-14T21:20:15.5409235Z * [new tag] ciflow/inductor/160649 -> ciflow/inductor/160649 2025-08-14T21:20:15.5410550Z * [new tag] ciflow/inductor/160658 -> ciflow/inductor/160658 2025-08-14T21:20:15.5411888Z * [new tag] ciflow/inductor/160662 -> ciflow/inductor/160662 2025-08-14T21:20:15.5413157Z * [new tag] ciflow/inductor/160668 -> ciflow/inductor/160668 2025-08-14T21:20:15.5414556Z * [new tag] ciflow/inductor/160669 -> ciflow/inductor/160669 2025-08-14T21:20:15.5415843Z * [new tag] ciflow/inductor/160670 -> ciflow/inductor/160670 2025-08-14T21:20:15.5417448Z * [new tag] ciflow/inductor/160671 -> ciflow/inductor/160671 2025-08-14T21:20:15.5418732Z * [new tag] ciflow/inductor/160677 -> ciflow/inductor/160677 2025-08-14T21:20:15.5420032Z * [new tag] ciflow/inductor/160679 -> ciflow/inductor/160679 2025-08-14T21:20:15.5421610Z * [new tag] ciflow/inductor/3b9a386 -> ciflow/inductor/3b9a386 2025-08-14T21:20:15.5423133Z * [new tag] ciflow/inductor/3d4b92b -> ciflow/inductor/3d4b92b 2025-08-14T21:20:15.5424585Z * [new tag] ciflow/inductor/d224ac7 -> ciflow/inductor/d224ac7 2025-08-14T21:20:15.5426135Z * [new tag] ciflow/linux-aarch64/147855 -> ciflow/linux-aarch64/147855 2025-08-14T21:20:15.5427370Z * [new tag] ciflow/linux-aarch64/157994 -> ciflow/linux-aarch64/157994 2025-08-14T21:20:15.5428472Z * [new tag] ciflow/linux-aarch64/159737 -> ciflow/linux-aarch64/159737 2025-08-14T21:20:15.5429733Z * [new tag] ciflow/linux-aarch64/160078 -> ciflow/linux-aarch64/160078 2025-08-14T21:20:15.5430833Z * [new tag] ciflow/linux-aarch64/160299 -> ciflow/linux-aarch64/160299 2025-08-14T21:20:15.5432136Z * [new tag] ciflow/linux-aarch64/160301 -> ciflow/linux-aarch64/160301 2025-08-14T21:20:15.5433622Z * [new tag] ciflow/mps/155923 -> ciflow/mps/155923 2025-08-14T21:20:15.5434872Z * [new tag] ciflow/mps/157553 -> ciflow/mps/157553 2025-08-14T21:20:15.5436077Z * [new tag] ciflow/mps/157635 -> ciflow/mps/157635 2025-08-14T21:20:15.5437273Z * [new tag] ciflow/mps/160541 -> ciflow/mps/160541 2025-08-14T21:20:15.5438884Z * [new tag] ciflow/nightly/156049 -> ciflow/nightly/156049 2025-08-14T21:20:15.5440008Z * [new tag] ciflow/nightly/158104 -> ciflow/nightly/158104 2025-08-14T21:20:15.5441587Z * [new tag] ciflow/op-benchmark/157994 -> ciflow/op-benchmark/157994 2025-08-14T21:20:15.5443159Z * [new tag] ciflow/periodic-rocm-mi300/139971 -> ciflow/periodic-rocm-mi300/139971 2025-08-14T21:20:15.5444592Z * [new tag] ciflow/periodic-rocm-mi300/160073 -> ciflow/periodic-rocm-mi300/160073 2025-08-14T21:20:15.5445496Z * [new tag] ciflow/periodic-rocm-mi300/160538 -> ciflow/periodic-rocm-mi300/160538 2025-08-14T21:20:15.5447716Z * [new tag] ciflow/periodic/054a2fd -> ciflow/periodic/054a2fd 2025-08-14T21:20:15.5452977Z * [new tag] ciflow/periodic/131296 -> ciflow/periodic/131296 2025-08-14T21:20:15.5453379Z * [new tag] ciflow/periodic/139971 -> ciflow/periodic/139971 2025-08-14T21:20:15.5454366Z * [new tag] ciflow/periodic/143959 -> ciflow/periodic/143959 2025-08-14T21:20:15.5455342Z * [new tag] ciflow/periodic/154595 -> ciflow/periodic/154595 2025-08-14T21:20:15.5456933Z * [new tag] ciflow/periodic/156703 -> ciflow/periodic/156703 2025-08-14T21:20:15.5457817Z * [new tag] ciflow/periodic/160201 -> ciflow/periodic/160201 2025-08-14T21:20:15.5459407Z * [new tag] ciflow/periodic/160424 -> ciflow/periodic/160424 2025-08-14T21:20:15.5460229Z * [new tag] ciflow/periodic/160538 -> ciflow/periodic/160538 2025-08-14T21:20:15.5463210Z * [new tag] ciflow/periodic/1febab2a89302464f6c7d69cfbef7a24c421ea65 -> ciflow/periodic/1febab2a89302464f6c7d69cfbef7a24c421ea65 2025-08-14T21:20:15.5464891Z * [new tag] ciflow/periodic/2a6d37d -> ciflow/periodic/2a6d37d 2025-08-14T21:20:15.5466490Z * [new tag] ciflow/periodic/2ee22e435131369a7e4f8cc4732579acc29a941b -> ciflow/periodic/2ee22e435131369a7e4f8cc4732579acc29a941b 2025-08-14T21:20:15.5467529Z * [new tag] ciflow/periodic/317eeb8 -> ciflow/periodic/317eeb8 2025-08-14T21:20:15.5469160Z * [new tag] ciflow/periodic/3c32 -> ciflow/periodic/3c32 2025-08-14T21:20:15.5470748Z * [new tag] ciflow/periodic/3e98831 -> ciflow/periodic/3e98831 2025-08-14T21:20:15.5472409Z * [new tag] ciflow/periodic/4a773e1e867f28a8ff0b15203e5cd9548f74fcee -> ciflow/periodic/4a773e1e867f28a8ff0b15203e5cd9548f74fcee 2025-08-14T21:20:15.5473978Z * [new tag] ciflow/periodic/5f5f508aa836a46dfe88857fb223049616b94e93 -> ciflow/periodic/5f5f508aa836a46dfe88857fb223049616b94e93 2025-08-14T21:20:15.5475383Z * [new tag] ciflow/periodic/94512-point -> ciflow/periodic/94512-point 2025-08-14T21:20:15.5477227Z * [new tag] ciflow/periodic/csl/test87519 -> ciflow/periodic/csl/test87519 2025-08-14T21:20:15.5478692Z * [new tag] ciflow/periodic/csltest88275 -> ciflow/periodic/csltest88275 2025-08-14T21:20:15.5480136Z * [new tag] ciflow/periodic/csltest88761 -> ciflow/periodic/csltest88761 2025-08-14T21:20:15.5482064Z * [new tag] ciflow/periodic/d7114f05b10de8e6de81ffc567d63944c3117d51 -> ciflow/periodic/d7114f05b10de8e6de81ffc567d63944c3117d51 2025-08-14T21:20:15.5483381Z * [new tag] ciflow/periodic/release_1.12 -> ciflow/periodic/release_1.12 2025-08-14T21:20:15.5485052Z * [new tag] ciflow/periodic/release_1.12.0 -> ciflow/periodic/release_1.12.0 2025-08-14T21:20:15.5486767Z * [new tag] ciflow/periodic/sha-ec5b83 -> ciflow/periodic/sha-ec5b83 2025-08-14T21:20:15.5488436Z * [new tag] ciflow/rocm-mi300/151360 -> ciflow/rocm-mi300/151360 2025-08-14T21:20:15.5489538Z * [new tag] ciflow/rocm-mi300/159158 -> ciflow/rocm-mi300/159158 2025-08-14T21:20:15.5490833Z * [new tag] ciflow/rocm-mi300/160073 -> ciflow/rocm-mi300/160073 2025-08-14T21:20:15.5492170Z * [new tag] ciflow/rocm-mi300/160468 -> ciflow/rocm-mi300/160468 2025-08-14T21:20:15.5493127Z * [new tag] ciflow/rocm-mi300/160538 -> ciflow/rocm-mi300/160538 2025-08-14T21:20:15.5495140Z * [new tag] ciflow/rocm-mi355/160215 -> ciflow/rocm-mi355/160215 2025-08-14T21:20:15.5496661Z * [new tag] ciflow/rocm/148492 -> ciflow/rocm/148492 2025-08-14T21:20:15.5497884Z * [new tag] ciflow/rocm/151360 -> ciflow/rocm/151360 2025-08-14T21:20:15.5499109Z * [new tag] ciflow/rocm/151845 -> ciflow/rocm/151845 2025-08-14T21:20:15.5500571Z * [new tag] ciflow/rocm/154864 -> ciflow/rocm/154864 2025-08-14T21:20:15.5501909Z * [new tag] ciflow/rocm/156491 -> ciflow/rocm/156491 2025-08-14T21:20:15.5502992Z * [new tag] ciflow/rocm/158219 -> ciflow/rocm/158219 2025-08-14T21:20:15.5504324Z * [new tag] ciflow/rocm/158220 -> ciflow/rocm/158220 2025-08-14T21:20:15.5505334Z * [new tag] ciflow/rocm/158224 -> ciflow/rocm/158224 2025-08-14T21:20:15.5506746Z * [new tag] ciflow/rocm/159158 -> ciflow/rocm/159158 2025-08-14T21:20:15.5507821Z * [new tag] ciflow/rocm/160215 -> ciflow/rocm/160215 2025-08-14T21:20:15.5509130Z * [new tag] ciflow/rocm/160468 -> ciflow/rocm/160468 2025-08-14T21:20:15.5510693Z * [new tag] ciflow/rocm/160538 -> ciflow/rocm/160538 2025-08-14T21:20:15.5512217Z * [new tag] ciflow/s390/143959 -> ciflow/s390/143959 2025-08-14T21:20:15.5513954Z * [new tag] ciflow/slow/01c7106 -> ciflow/slow/01c7106 2025-08-14T21:20:15.5515313Z * [new tag] ciflow/slow/0577043 -> ciflow/slow/0577043 2025-08-14T21:20:15.5517086Z * [new tag] ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym -> ciflow/slow/0d5b74da0cab798fbfdb9caa53fad816999c8386-sdym 2025-08-14T21:20:15.5518026Z * [new tag] ciflow/slow/0e81104 -> ciflow/slow/0e81104 2025-08-14T21:20:15.5519393Z * [new tag] ciflow/slow/154595 -> ciflow/slow/154595 2025-08-14T21:20:15.5520769Z * [new tag] ciflow/slow/1732077 -> ciflow/slow/1732077 2025-08-14T21:20:15.5522198Z * [new tag] ciflow/slow/187eb7c -> ciflow/slow/187eb7c 2025-08-14T21:20:15.5523590Z * [new tag] ciflow/slow/1faef89 -> ciflow/slow/1faef89 2025-08-14T21:20:15.5525494Z * [new tag] ciflow/slow/3920ec1 -> ciflow/slow/3920ec1 2025-08-14T21:20:15.5527060Z * [new tag] ciflow/slow/3b7c6b2 -> ciflow/slow/3b7c6b2 2025-08-14T21:20:15.5528582Z * [new tag] ciflow/slow/59a3759 -> ciflow/slow/59a3759 2025-08-14T21:20:15.5529972Z * [new tag] ciflow/slow/70ef0bb -> ciflow/slow/70ef0bb 2025-08-14T21:20:15.5531444Z * [new tag] ciflow/slow/788ff06 -> ciflow/slow/788ff06 2025-08-14T21:20:15.5533241Z * [new tag] ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym -> ciflow/slow/8751002215790a3a88750faa8f4366933e296693-sdym 2025-08-14T21:20:15.5534171Z * [new tag] ciflow/slow/9d85864 -> ciflow/slow/9d85864 2025-08-14T21:20:15.5535895Z * [new tag] ciflow/slow/9ffad5b -> ciflow/slow/9ffad5b 2025-08-14T21:20:15.5537402Z * [new tag] ciflow/slow/a206e8b -> ciflow/slow/a206e8b 2025-08-14T21:20:15.5538859Z * [new tag] ciflow/slow/a837609 -> ciflow/slow/a837609 2025-08-14T21:20:15.5540325Z * [new tag] ciflow/slow/af841f3 -> ciflow/slow/af841f3 2025-08-14T21:20:15.5542185Z * [new tag] ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym -> ciflow/slow/da3aba1e46157c4df504b067477cdf2b3c96b194-sdym 2025-08-14T21:20:15.5543419Z * [new tag] ciflow/trunk/131296 -> ciflow/trunk/131296 2025-08-14T21:20:15.5544670Z * [new tag] ciflow/trunk/137400 -> ciflow/trunk/137400 2025-08-14T21:20:15.5546458Z * [new tag] ciflow/trunk/138996 -> ciflow/trunk/138996 2025-08-14T21:20:15.5547334Z * [new tag] ciflow/trunk/139971 -> ciflow/trunk/139971 2025-08-14T21:20:15.5549020Z * [new tag] ciflow/trunk/147360 -> ciflow/trunk/147360 2025-08-14T21:20:15.5550215Z * [new tag] ciflow/trunk/147855 -> ciflow/trunk/147855 2025-08-14T21:20:15.5551426Z * [new tag] ciflow/trunk/148180 -> ciflow/trunk/148180 2025-08-14T21:20:15.5552622Z * [new tag] ciflow/trunk/148328 -> ciflow/trunk/148328 2025-08-14T21:20:15.5553900Z * [new tag] ciflow/trunk/148492 -> ciflow/trunk/148492 2025-08-14T21:20:15.5555304Z * [new tag] ciflow/trunk/150282 -> ciflow/trunk/150282 2025-08-14T21:20:15.5556384Z * [new tag] ciflow/trunk/150302 -> ciflow/trunk/150302 2025-08-14T21:20:15.5558106Z * [new tag] ciflow/trunk/151845 -> ciflow/trunk/151845 2025-08-14T21:20:15.5559771Z * [new tag] ciflow/trunk/152624 -> ciflow/trunk/152624 2025-08-14T21:20:15.5561247Z * [new tag] ciflow/trunk/154193 -> ciflow/trunk/154193 2025-08-14T21:20:15.5562602Z * [new tag] ciflow/trunk/154595 -> ciflow/trunk/154595 2025-08-14T21:20:15.5563883Z * [new tag] ciflow/trunk/154650 -> ciflow/trunk/154650 2025-08-14T21:20:15.5565325Z * [new tag] ciflow/trunk/154694 -> ciflow/trunk/154694 2025-08-14T21:20:15.5566549Z * [new tag] ciflow/trunk/155958 -> ciflow/trunk/155958 2025-08-14T21:20:15.5567813Z * [new tag] ciflow/trunk/156049 -> ciflow/trunk/156049 2025-08-14T21:20:15.5569221Z * [new tag] ciflow/trunk/156703 -> ciflow/trunk/156703 2025-08-14T21:20:15.5570396Z * [new tag] ciflow/trunk/156851 -> ciflow/trunk/156851 2025-08-14T21:20:15.5571654Z * [new tag] ciflow/trunk/157148 -> ciflow/trunk/157148 2025-08-14T21:20:15.5572899Z * [new tag] ciflow/trunk/157152 -> ciflow/trunk/157152 2025-08-14T21:20:15.5574167Z * [new tag] ciflow/trunk/157432 -> ciflow/trunk/157432 2025-08-14T21:20:15.5575484Z * [new tag] ciflow/trunk/157685 -> ciflow/trunk/157685 2025-08-14T21:20:15.5576771Z * [new tag] ciflow/trunk/157689 -> ciflow/trunk/157689 2025-08-14T21:20:15.5578152Z * [new tag] ciflow/trunk/157699 -> ciflow/trunk/157699 2025-08-14T21:20:15.5579482Z * [new tag] ciflow/trunk/157813 -> ciflow/trunk/157813 2025-08-14T21:20:15.5580728Z * [new tag] ciflow/trunk/157994 -> ciflow/trunk/157994 2025-08-14T21:20:15.5581995Z * [new tag] ciflow/trunk/158091 -> ciflow/trunk/158091 2025-08-14T21:20:15.5583297Z * [new tag] ciflow/trunk/158104 -> ciflow/trunk/158104 2025-08-14T21:20:15.5584532Z * [new tag] ciflow/trunk/158219 -> ciflow/trunk/158219 2025-08-14T21:20:15.5585838Z * [new tag] ciflow/trunk/158220 -> ciflow/trunk/158220 2025-08-14T21:20:15.5587076Z * [new tag] ciflow/trunk/158224 -> ciflow/trunk/158224 2025-08-14T21:20:15.5588506Z * [new tag] ciflow/trunk/158529 -> ciflow/trunk/158529 2025-08-14T21:20:15.5589781Z * [new tag] ciflow/trunk/158647 -> ciflow/trunk/158647 2025-08-14T21:20:15.5591060Z * [new tag] ciflow/trunk/158810 -> ciflow/trunk/158810 2025-08-14T21:20:15.5592315Z * [new tag] ciflow/trunk/158812 -> ciflow/trunk/158812 2025-08-14T21:20:15.5593600Z * [new tag] ciflow/trunk/158863 -> ciflow/trunk/158863 2025-08-14T21:20:15.5594893Z * [new tag] ciflow/trunk/158864 -> ciflow/trunk/158864 2025-08-14T21:20:15.5596398Z * [new tag] ciflow/trunk/158883 -> ciflow/trunk/158883 2025-08-14T21:20:15.5597322Z * [new tag] ciflow/trunk/158914 -> ciflow/trunk/158914 2025-08-14T21:20:15.5598754Z * [new tag] ciflow/trunk/158965 -> ciflow/trunk/158965 2025-08-14T21:20:15.5600026Z * [new tag] ciflow/trunk/158987 -> ciflow/trunk/158987 2025-08-14T21:20:15.5601513Z * [new tag] ciflow/trunk/159033 -> ciflow/trunk/159033 2025-08-14T21:20:15.5602879Z * [new tag] ciflow/trunk/159140 -> ciflow/trunk/159140 2025-08-14T21:20:15.5604156Z * [new tag] ciflow/trunk/159158 -> ciflow/trunk/159158 2025-08-14T21:20:15.5605573Z * [new tag] ciflow/trunk/159553 -> ciflow/trunk/159553 2025-08-14T21:20:15.5606889Z * [new tag] ciflow/trunk/159562 -> ciflow/trunk/159562 2025-08-14T21:20:15.5608369Z * [new tag] ciflow/trunk/159682 -> ciflow/trunk/159682 2025-08-14T21:20:15.5610115Z * [new tag] ciflow/trunk/159691 -> ciflow/trunk/159691 2025-08-14T21:20:15.5610626Z * [new tag] ciflow/trunk/159842 -> ciflow/trunk/159842 2025-08-14T21:20:15.5612214Z * [new tag] ciflow/trunk/159889 -> ciflow/trunk/159889 2025-08-14T21:20:15.5613537Z * [new tag] ciflow/trunk/159923 -> ciflow/trunk/159923 2025-08-14T21:20:15.5614788Z * [new tag] ciflow/trunk/160004 -> ciflow/trunk/160004 2025-08-14T21:20:15.5616173Z * [new tag] ciflow/trunk/160113 -> ciflow/trunk/160113 2025-08-14T21:20:15.5617453Z * [new tag] ciflow/trunk/160161 -> ciflow/trunk/160161 2025-08-14T21:20:15.5618832Z * [new tag] ciflow/trunk/160168 -> ciflow/trunk/160168 2025-08-14T21:20:15.5620151Z * [new tag] ciflow/trunk/160181 -> ciflow/trunk/160181 2025-08-14T21:20:15.5621416Z * [new tag] ciflow/trunk/160183 -> ciflow/trunk/160183 2025-08-14T21:20:15.5622664Z * [new tag] ciflow/trunk/160190 -> ciflow/trunk/160190 2025-08-14T21:20:15.5623908Z * [new tag] ciflow/trunk/160198 -> ciflow/trunk/160198 2025-08-14T21:20:15.5625422Z * [new tag] ciflow/trunk/160205 -> ciflow/trunk/160205 2025-08-14T21:20:15.5626631Z * [new tag] ciflow/trunk/160219 -> ciflow/trunk/160219 2025-08-14T21:20:15.5627895Z * [new tag] ciflow/trunk/160224 -> ciflow/trunk/160224 2025-08-14T21:20:15.5629175Z * [new tag] ciflow/trunk/160250 -> ciflow/trunk/160250 2025-08-14T21:20:15.5630952Z * [new tag] ciflow/trunk/160253 -> ciflow/trunk/160253 2025-08-14T21:20:15.5632397Z * [new tag] ciflow/trunk/160335 -> ciflow/trunk/160335 2025-08-14T21:20:15.5634356Z * [new tag] ciflow/trunk/160338 -> ciflow/trunk/160338 2025-08-14T21:20:15.5635017Z * [new tag] ciflow/trunk/160383 -> ciflow/trunk/160383 2025-08-14T21:20:15.5636490Z * [new tag] ciflow/trunk/160401 -> ciflow/trunk/160401 2025-08-14T21:20:15.5637760Z * [new tag] ciflow/trunk/160403 -> ciflow/trunk/160403 2025-08-14T21:20:15.5639060Z * [new tag] ciflow/trunk/160430 -> ciflow/trunk/160430 2025-08-14T21:20:15.5640305Z * [new tag] ciflow/trunk/160431 -> ciflow/trunk/160431 2025-08-14T21:20:15.5641760Z * [new tag] ciflow/trunk/160439 -> ciflow/trunk/160439 2025-08-14T21:20:15.5643351Z * [new tag] ciflow/trunk/160449 -> ciflow/trunk/160449 2025-08-14T21:20:15.5644726Z * [new tag] ciflow/trunk/160454 -> ciflow/trunk/160454 2025-08-14T21:20:15.5646052Z * [new tag] ciflow/trunk/160468 -> ciflow/trunk/160468 2025-08-14T21:20:15.5647497Z * [new tag] ciflow/trunk/160481 -> ciflow/trunk/160481 2025-08-14T21:20:15.5648851Z * [new tag] ciflow/trunk/160485 -> ciflow/trunk/160485 2025-08-14T21:20:15.5650223Z * [new tag] ciflow/trunk/160519 -> ciflow/trunk/160519 2025-08-14T21:20:15.5651513Z * [new tag] ciflow/trunk/160527 -> ciflow/trunk/160527 2025-08-14T21:20:15.5652993Z * [new tag] ciflow/trunk/160560 -> ciflow/trunk/160560 2025-08-14T21:20:15.5654288Z * [new tag] ciflow/trunk/160578 -> ciflow/trunk/160578 2025-08-14T21:20:15.5655557Z * [new tag] ciflow/trunk/160589 -> ciflow/trunk/160589 2025-08-14T21:20:15.5656882Z * [new tag] ciflow/trunk/160592 -> ciflow/trunk/160592 2025-08-14T21:20:15.5658139Z * [new tag] ciflow/trunk/160649 -> ciflow/trunk/160649 2025-08-14T21:20:15.5659930Z * [new tag] ciflow/trunk/160656 -> ciflow/trunk/160656 2025-08-14T21:20:15.5661245Z * [new tag] ciflow/unstable/123 -> ciflow/unstable/123 2025-08-14T21:20:15.5662852Z * [new tag] ciflow/vllm/160116 -> ciflow/vllm/160116 2025-08-14T21:20:15.5664058Z * [new tag] ciflow/vllm/160583 -> ciflow/vllm/160583 2025-08-14T21:20:15.5665354Z * [new tag] ciflow/vllm/160619 -> ciflow/vllm/160619 2025-08-14T21:20:15.5666540Z * [new tag] ciflow/vllm/160625 -> ciflow/vllm/160625 2025-08-14T21:20:15.5667713Z * [new tag] ciflow/vllm/160627 -> ciflow/vllm/160627 2025-08-14T21:20:15.5669405Z * [new tag] ciflow/win-arm64/156049 -> ciflow/win-arm64/156049 2025-08-14T21:20:15.5670332Z * [new tag] ciflow/win-arm64/158104 -> ciflow/win-arm64/158104 2025-08-14T21:20:15.5671695Z * [new tag] ciflow/win-arm64/159553 -> ciflow/win-arm64/159553 2025-08-14T21:20:15.5672901Z * [new tag] ciflow/win-arm64/159562 -> ciflow/win-arm64/159562 2025-08-14T21:20:15.5674241Z * [new tag] ciflow/win-arm64/159777 -> ciflow/win-arm64/159777 2025-08-14T21:20:15.5675483Z * [new tag] ciflow/win-arm64/159780 -> ciflow/win-arm64/159780 2025-08-14T21:20:15.5676638Z * [new tag] ciflow/win-arm64/159842 -> ciflow/win-arm64/159842 2025-08-14T21:20:15.5678003Z * [new tag] ciflow/win-arm64/160250 -> ciflow/win-arm64/160250 2025-08-14T21:20:15.5678847Z * [new tag] ciflow/win-arm64/160253 -> ciflow/win-arm64/160253 2025-08-14T21:20:15.5680217Z * [new tag] ciflow/win-arm64/160454 -> ciflow/win-arm64/160454 2025-08-14T21:20:15.5681373Z * [new tag] ciflow/win-arm64/160560 -> ciflow/win-arm64/160560 2025-08-14T21:20:15.5682864Z * [new tag] ciflow/xpu/138996 -> ciflow/xpu/138996 2025-08-14T21:20:15.5684056Z * [new tag] ciflow/xpu/139971 -> ciflow/xpu/139971 2025-08-14T21:20:15.5685687Z * [new tag] ciflow/xpu/140972 -> ciflow/xpu/140972 2025-08-14T21:20:15.5686968Z * [new tag] ciflow/xpu/143553 -> ciflow/xpu/143553 2025-08-14T21:20:15.5688256Z * [new tag] ciflow/xpu/156272 -> ciflow/xpu/156272 2025-08-14T21:20:15.5689470Z * [new tag] ciflow/xpu/156812 -> ciflow/xpu/156812 2025-08-14T21:20:15.5690680Z * [new tag] ciflow/xpu/157699 -> ciflow/xpu/157699 2025-08-14T21:20:15.5691834Z * [new tag] ciflow/xpu/157994 -> ciflow/xpu/157994 2025-08-14T21:20:15.5693044Z * [new tag] ciflow/xpu/158336 -> ciflow/xpu/158336 2025-08-14T21:20:15.5694680Z * [new tag] ciflow/xpu/158733 -> ciflow/xpu/158733 2025-08-14T21:20:15.5695940Z * [new tag] ciflow/xpu/159033 -> ciflow/xpu/159033 2025-08-14T21:20:15.5697476Z * [new tag] ciflow/xpu/159118 -> ciflow/xpu/159118 2025-08-14T21:20:15.5699077Z * [new tag] ciflow/xpu/159140 -> ciflow/xpu/159140 2025-08-14T21:20:15.5700824Z * [new tag] ciflow/xpu/159241 -> ciflow/xpu/159241 2025-08-14T21:20:15.5702144Z * [new tag] ciflow/xpu/159473 -> ciflow/xpu/159473 2025-08-14T21:20:15.5703553Z * [new tag] ciflow/xpu/159474 -> ciflow/xpu/159474 2025-08-14T21:20:15.5704795Z * [new tag] ciflow/xpu/159553 -> ciflow/xpu/159553 2025-08-14T21:20:15.5706100Z * [new tag] ciflow/xpu/159944 -> ciflow/xpu/159944 2025-08-14T21:20:15.5707586Z * [new tag] ciflow/xpu/160062 -> ciflow/xpu/160062 2025-08-14T21:20:15.5708878Z * [new tag] ciflow/xpu/160067 -> ciflow/xpu/160067 2025-08-14T21:20:15.5710304Z * [new tag] ciflow/xpu/160158 -> ciflow/xpu/160158 2025-08-14T21:20:15.5711592Z * [new tag] ciflow/xpu/160173 -> ciflow/xpu/160173 2025-08-14T21:20:15.5712929Z * [new tag] ciflow/xpu/160183 -> ciflow/xpu/160183 2025-08-14T21:20:15.5714188Z * [new tag] ciflow/xpu/160301 -> ciflow/xpu/160301 2025-08-14T21:20:15.5715457Z * [new tag] ciflow/xpu/160403 -> ciflow/xpu/160403 2025-08-14T21:20:15.5716912Z * [new tag] ciflow/xpu/160606 -> ciflow/xpu/160606 2025-08-14T21:20:15.5718231Z * [new tag] cslpull75 -> cslpull75 2025-08-14T21:20:15.5719594Z * [new tag] cslpull76 -> cslpull76 2025-08-14T21:20:15.5720762Z * [new tag] cslpull77 -> cslpull77 2025-08-14T21:20:15.5722045Z * [new tag] cslpull78 -> cslpull78 2025-08-14T21:20:15.5723620Z * [new tag] cslpull79 -> cslpull79 2025-08-14T21:20:15.5725379Z * [new tag] cslpull80 -> cslpull80 2025-08-14T21:20:15.5726769Z * [new tag] cslpull81 -> cslpull81 2025-08-14T21:20:15.5728175Z * [new tag] cslpull82 -> cslpull82 2025-08-14T21:20:15.5729517Z * [new tag] cslpull83 -> cslpull83 2025-08-14T21:20:15.5730835Z * [new tag] cslpull84 -> cslpull84 2025-08-14T21:20:15.5732167Z * [new tag] cslpull85 -> cslpull85 2025-08-14T21:20:15.5733558Z * [new tag] cslpull86 -> cslpull86 2025-08-14T21:20:15.5734907Z * [new tag] cslpull87 -> cslpull87 2025-08-14T21:20:15.5736295Z * [new tag] cslpull88 -> cslpull88 2025-08-14T21:20:15.5737620Z * [new tag] cslpull89 -> cslpull89 2025-08-14T21:20:15.5738818Z * [new tag] cslpull90 -> cslpull90 2025-08-14T21:20:15.5740708Z * [new tag] cslpull91 -> cslpull91 2025-08-14T21:20:15.5742020Z * [new tag] cslpull92 -> cslpull92 2025-08-14T21:20:15.5743410Z * [new tag] flight_5 -> flight_5 2025-08-14T21:20:15.5744926Z * [new tag] flight_5.1 -> flight_5.1 2025-08-14T21:20:15.5746776Z * [new tag] flight_5.2 -> flight_5.2 2025-08-14T21:20:15.5748316Z * [new tag] flight_5.3 -> flight_5.3 2025-08-14T21:20:15.5749643Z * [new tag] forpull1 -> forpull1 2025-08-14T21:20:15.5751386Z * [new tag] malfet/tag-2ef5611 -> malfet/tag-2ef5611 2025-08-14T21:20:15.5752745Z * [new tag] malfet/tag-317b1a0 -> malfet/tag-317b1a0 2025-08-14T21:20:15.5754098Z * [new tag] malfet/tag-ec6f767 -> malfet/tag-ec6f767 2025-08-14T21:20:15.5755602Z * [new tag] nightly-binary -> nightly-binary 2025-08-14T21:20:15.5756749Z * [new tag] sqzhang_flight4_plus -> sqzhang_flight4_plus 2025-08-14T21:20:15.5758250Z * [new tag] sqzhang_flight_3 -> sqzhang_flight_3 2025-08-14T21:20:15.5760145Z * [new tag] trunk/01584d2a7d029c9749eb73678cf1dc313cc35df6 -> trunk/01584d2a7d029c9749eb73678cf1dc313cc35df6 2025-08-14T21:20:15.5761243Z * [new tag] trunk/017259f9c65b6fad55fb9597d7077e2543eaae46 -> trunk/017259f9c65b6fad55fb9597d7077e2543eaae46 2025-08-14T21:20:15.5763236Z * [new tag] trunk/01bcf9a40dea937637d2cdd530bed2652510943d -> trunk/01bcf9a40dea937637d2cdd530bed2652510943d 2025-08-14T21:20:15.5764836Z * [new tag] trunk/01f66d08d93365015f4af005a252f439c4d4013a -> trunk/01f66d08d93365015f4af005a252f439c4d4013a 2025-08-14T21:20:15.5766115Z * [new tag] trunk/03b254e49f2d4c092e6ca712e5702cf2895aa47e -> trunk/03b254e49f2d4c092e6ca712e5702cf2895aa47e 2025-08-14T21:20:15.5767528Z * [new tag] trunk/05029ad1c30865d3f7e7fd13384db9d826e563eb -> trunk/05029ad1c30865d3f7e7fd13384db9d826e563eb 2025-08-14T21:20:15.5769005Z * [new tag] trunk/05c19d1acecc01b0d2512364183058a6885b9869 -> trunk/05c19d1acecc01b0d2512364183058a6885b9869 2025-08-14T21:20:15.5770287Z * [new tag] trunk/05c417715f791875fbf28cfc3fc86142de1a3206 -> trunk/05c417715f791875fbf28cfc3fc86142de1a3206 2025-08-14T21:20:15.5771930Z * [new tag] trunk/06824f3c7268bb807a422b663047cd0900ddd126 -> trunk/06824f3c7268bb807a422b663047cd0900ddd126 2025-08-14T21:20:15.5772909Z * [new tag] trunk/077cb389746a7d61cfc018aad2ba29a8aa195610 -> trunk/077cb389746a7d61cfc018aad2ba29a8aa195610 2025-08-14T21:20:15.5774653Z * [new tag] trunk/089c4a1ba007ed4abb3e5e0eafd97b7584566057 -> trunk/089c4a1ba007ed4abb3e5e0eafd97b7584566057 2025-08-14T21:20:15.5777065Z * [new tag] trunk/09381f5dacda7bbbfa361f5df76bde5cd309adc1 -> trunk/09381f5dacda7bbbfa361f5df76bde5cd309adc1 2025-08-14T21:20:15.5777494Z * [new tag] trunk/0bd3af4fb87445f4de3a1f9b823e399c8b3cefde -> trunk/0bd3af4fb87445f4de3a1f9b823e399c8b3cefde 2025-08-14T21:20:15.5779442Z * [new tag] trunk/0d3461bac0fb5177e35152d980b301ea3a0aa2c4 -> trunk/0d3461bac0fb5177e35152d980b301ea3a0aa2c4 2025-08-14T21:20:15.5779921Z * [new tag] trunk/0d40ff3b496e68193bc16d5391fa2e3623709f81 -> trunk/0d40ff3b496e68193bc16d5391fa2e3623709f81 2025-08-14T21:20:15.5781660Z * [new tag] trunk/0d71ca2c46753bb268bfdcf815c14415c122a289 -> trunk/0d71ca2c46753bb268bfdcf815c14415c122a289 2025-08-14T21:20:15.5782934Z * [new tag] trunk/0d88593dd826544c9e7bd4aa615ef86847a78d2b -> trunk/0d88593dd826544c9e7bd4aa615ef86847a78d2b 2025-08-14T21:20:15.5784418Z * [new tag] trunk/0e3e377bd5126cfcc69d70c4d77b352d3404cc11 -> trunk/0e3e377bd5126cfcc69d70c4d77b352d3404cc11 2025-08-14T21:20:15.5786373Z * [new tag] trunk/0f3b10b8eebe68e3c75d473d499b87dfe14a2eca -> trunk/0f3b10b8eebe68e3c75d473d499b87dfe14a2eca 2025-08-14T21:20:15.5787905Z * [new tag] trunk/101276f81b4d2a8c31bfd6796b986d4c1bfdf483 -> trunk/101276f81b4d2a8c31bfd6796b986d4c1bfdf483 2025-08-14T21:20:15.5789433Z * [new tag] trunk/1028c5e2d50e121865bf98307e7c035f549a24b2 -> trunk/1028c5e2d50e121865bf98307e7c035f549a24b2 2025-08-14T21:20:15.5790853Z * [new tag] trunk/10bc36fe840cb3510fab84d2ea22663b76702f1e -> trunk/10bc36fe840cb3510fab84d2ea22663b76702f1e 2025-08-14T21:20:15.5792126Z * [new tag] trunk/10e3514c962b58cbbee994257872a626ff76d51b -> trunk/10e3514c962b58cbbee994257872a626ff76d51b 2025-08-14T21:20:15.5793647Z * [new tag] trunk/1128f4c2a822cbe34a9d966306af15097179ffe1 -> trunk/1128f4c2a822cbe34a9d966306af15097179ffe1 2025-08-14T21:20:15.5795061Z * [new tag] trunk/114a6c40434bfb9cfa5abc30e9e34d81300d743e -> trunk/114a6c40434bfb9cfa5abc30e9e34d81300d743e 2025-08-14T21:20:15.5796566Z * [new tag] trunk/118bc97b14c24ac88a4b0c0750a9e7bf93154c76 -> trunk/118bc97b14c24ac88a4b0c0750a9e7bf93154c76 2025-08-14T21:20:15.5797621Z * [new tag] trunk/1196bb1c2e4d5a7edc09f2260e3034132f0c6c91 -> trunk/1196bb1c2e4d5a7edc09f2260e3034132f0c6c91 2025-08-14T21:20:15.5799270Z * [new tag] trunk/11a3565f1872bbad9c253a127e8d4ce7a1b40ec8 -> trunk/11a3565f1872bbad9c253a127e8d4ce7a1b40ec8 2025-08-14T21:20:15.5800478Z * [new tag] trunk/15e49f61643e4c0eef420f0981609709ef55b848 -> trunk/15e49f61643e4c0eef420f0981609709ef55b848 2025-08-14T21:20:15.5802141Z * [new tag] trunk/16d15445f8bd8740095b23de4af89d757af793ca -> trunk/16d15445f8bd8740095b23de4af89d757af793ca 2025-08-14T21:20:15.5803522Z * [new tag] trunk/178515d0ff6833c8e9221482b2a650ab31e00019 -> trunk/178515d0ff6833c8e9221482b2a650ab31e00019 2025-08-14T21:20:15.5805110Z * [new tag] trunk/182efe31dbe43376e7eef7338356aaf94d5bcabe -> trunk/182efe31dbe43376e7eef7338356aaf94d5bcabe 2025-08-14T21:20:15.5806399Z * [new tag] trunk/194fcfcfbdad0add1a1b695321e31a576058f4cf -> trunk/194fcfcfbdad0add1a1b695321e31a576058f4cf 2025-08-14T21:20:15.5807954Z * [new tag] trunk/195b5c2e27eb8f21cbc8ad1e90f42db5a8cfccca -> trunk/195b5c2e27eb8f21cbc8ad1e90f42db5a8cfccca 2025-08-14T21:20:15.5809392Z * [new tag] trunk/198b5fd2d47fa3d5110ceba6827a3b18e0064014 -> trunk/198b5fd2d47fa3d5110ceba6827a3b18e0064014 2025-08-14T21:20:15.5810687Z * [new tag] trunk/199e9abb6a366bbd27c39d1da7c3123b4eea9b5a -> trunk/199e9abb6a366bbd27c39d1da7c3123b4eea9b5a 2025-08-14T21:20:15.5812152Z * [new tag] trunk/19b4283884b2d9b3a0eb364da10b1540d14ab7a7 -> trunk/19b4283884b2d9b3a0eb364da10b1540d14ab7a7 2025-08-14T21:20:15.5813727Z * [new tag] trunk/1c2587119152cec3905647a47c65d3d26619c5a8 -> trunk/1c2587119152cec3905647a47c65d3d26619c5a8 2025-08-14T21:20:15.5815115Z * [new tag] trunk/1c26c53851c212a7c90a325549a72f0571613a8c -> trunk/1c26c53851c212a7c90a325549a72f0571613a8c 2025-08-14T21:20:15.5816645Z * [new tag] trunk/1c2cba17eab2b09d87142883da2bdbdbcf018613 -> trunk/1c2cba17eab2b09d87142883da2bdbdbcf018613 2025-08-14T21:20:15.5818094Z * [new tag] trunk/1d80d697a269234b47ec7ede192faf3bb9b159e3 -> trunk/1d80d697a269234b47ec7ede192faf3bb9b159e3 2025-08-14T21:20:15.5819390Z * [new tag] trunk/1ea688f9a2602fbcde32c0302b822526ca4219dc -> trunk/1ea688f9a2602fbcde32c0302b822526ca4219dc 2025-08-14T21:20:15.5821026Z * [new tag] trunk/1f4057c11ac941fb324386ca594d0a6882185aad -> trunk/1f4057c11ac941fb324386ca594d0a6882185aad 2025-08-14T21:20:15.5822478Z * [new tag] trunk/1fc683cf17c8c673044538d10266c00f92987be2 -> trunk/1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:20:15.5823543Z * [new tag] trunk/1febab2a89302464f6c7d69cfbef7a24c421ea65 -> trunk/1febab2a89302464f6c7d69cfbef7a24c421ea65 2025-08-14T21:20:15.5825229Z * [new tag] trunk/206c1eef6571f906c2792d899a09136b3fce9673 -> trunk/206c1eef6571f906c2792d899a09136b3fce9673 2025-08-14T21:20:15.5826671Z * [new tag] trunk/20bdabbb3c5d6b118a94b2e045c777662563d5bb -> trunk/20bdabbb3c5d6b118a94b2e045c777662563d5bb 2025-08-14T21:20:15.5828057Z * [new tag] trunk/21392c0e06ac2b2621950455975ca6332f0bf641 -> trunk/21392c0e06ac2b2621950455975ca6332f0bf641 2025-08-14T21:20:15.5829146Z * [new tag] trunk/2247aa6d1d43e256255f5c74a781c3190a4387b6 -> trunk/2247aa6d1d43e256255f5c74a781c3190a4387b6 2025-08-14T21:20:15.5830629Z * [new tag] trunk/2259dbed4e0d3f2a8174b5847fd0741aed42451d -> trunk/2259dbed4e0d3f2a8174b5847fd0741aed42451d 2025-08-14T21:20:15.5832025Z * [new tag] trunk/231c72240d80091f099c95e326d3600cba866eee -> trunk/231c72240d80091f099c95e326d3600cba866eee 2025-08-14T21:20:15.5833080Z * [new tag] trunk/24257f5bfaa37795f74d9f64c1b43584128d4b8c -> trunk/24257f5bfaa37795f74d9f64c1b43584128d4b8c 2025-08-14T21:20:15.5835371Z * [new tag] trunk/24f43d0da7ad9c6e95a09a2fee610387728cc1cd -> trunk/24f43d0da7ad9c6e95a09a2fee610387728cc1cd 2025-08-14T21:20:15.5836655Z * [new tag] trunk/2898d3f965e5cd9d02fc2ecdab7c580fd457fea9 -> trunk/2898d3f965e5cd9d02fc2ecdab7c580fd457fea9 2025-08-14T21:20:15.5838206Z * [new tag] trunk/28ccc9e7247798980fe00a11bcd64a8016b5f227 -> trunk/28ccc9e7247798980fe00a11bcd64a8016b5f227 2025-08-14T21:20:15.5839384Z * [new tag] trunk/29712314dd5cf500a8ea3d1c69483a3cb768ca72 -> trunk/29712314dd5cf500a8ea3d1c69483a3cb768ca72 2025-08-14T21:20:15.5841048Z * [new tag] trunk/29d20d49f0b7f4e362e1cefdcdc4b5659969312c -> trunk/29d20d49f0b7f4e362e1cefdcdc4b5659969312c 2025-08-14T21:20:15.5842552Z * [new tag] trunk/2c5e10a5fceb208b11c3d569ae02e348b5893b31 -> trunk/2c5e10a5fceb208b11c3d569ae02e348b5893b31 2025-08-14T21:20:15.5843583Z * [new tag] trunk/2d0cdee394bccadcd0abe19dd4623ed978a331ad -> trunk/2d0cdee394bccadcd0abe19dd4623ed978a331ad 2025-08-14T21:20:15.5845304Z * [new tag] trunk/2e4e5ab4be9e0aeffd9c49b5b2f9f820bd0895b1 -> trunk/2e4e5ab4be9e0aeffd9c49b5b2f9f820bd0895b1 2025-08-14T21:20:15.5846775Z * [new tag] trunk/2ea40fba841b3af8103f332ba62e54f350ba9a51 -> trunk/2ea40fba841b3af8103f332ba62e54f350ba9a51 2025-08-14T21:20:15.5848070Z * [new tag] trunk/2ee22e435131369a7e4f8cc4732579acc29a941b -> trunk/2ee22e435131369a7e4f8cc4732579acc29a941b 2025-08-14T21:20:15.5849776Z * [new tag] trunk/2f4c2226175512af787725c4d5ad7313c60d4db1 -> trunk/2f4c2226175512af787725c4d5ad7313c60d4db1 2025-08-14T21:20:15.5850845Z * [new tag] trunk/3008d985a8fc155eb89374afff50cb33a6bd10d5 -> trunk/3008d985a8fc155eb89374afff50cb33a6bd10d5 2025-08-14T21:20:15.5852577Z * [new tag] trunk/3028fa6ce9d9c96671722ab8213a1a30670d7cf2 -> trunk/3028fa6ce9d9c96671722ab8213a1a30670d7cf2 2025-08-14T21:20:15.5853853Z * [new tag] trunk/303c614f3df95ae2b659c5f6c1838b14e4776ce6 -> trunk/303c614f3df95ae2b659c5f6c1838b14e4776ce6 2025-08-14T21:20:15.5855704Z * [new tag] trunk/305fa2239365ad17ac9c534a68bba8a149c42d67 -> trunk/305fa2239365ad17ac9c534a68bba8a149c42d67 2025-08-14T21:20:15.5857197Z * [new tag] trunk/31c9ac4319c0cc2ed8c6be701c6ccf73f6cb4706 -> trunk/31c9ac4319c0cc2ed8c6be701c6ccf73f6cb4706 2025-08-14T21:20:15.5858558Z * [new tag] trunk/32099961d588fc19ead8afe805d6b5108de75669 -> trunk/32099961d588fc19ead8afe805d6b5108de75669 2025-08-14T21:20:15.5860033Z * [new tag] trunk/32e5e2f596d55bb9441d5d53f3c58bcb55828047 -> trunk/32e5e2f596d55bb9441d5d53f3c58bcb55828047 2025-08-14T21:20:15.5861157Z * [new tag] trunk/334b38ccc4427b1d14981c48a3a0b92180d58225 -> trunk/334b38ccc4427b1d14981c48a3a0b92180d58225 2025-08-14T21:20:15.5862966Z * [new tag] trunk/334ecbd4ffe11858cae7d23d1190ddb4777c2513 -> trunk/334ecbd4ffe11858cae7d23d1190ddb4777c2513 2025-08-14T21:20:15.5864224Z * [new tag] trunk/33d94018668951611b318b7515ae96f04e48eac0 -> trunk/33d94018668951611b318b7515ae96f04e48eac0 2025-08-14T21:20:15.5865771Z * [new tag] trunk/34358f335d95213d96b6cca6a83e7bf3af6a9fcb -> trunk/34358f335d95213d96b6cca6a83e7bf3af6a9fcb 2025-08-14T21:20:15.5867216Z * [new tag] trunk/34ec5ed275f8aa875c80daa97b3e82af0b06f673 -> trunk/34ec5ed275f8aa875c80daa97b3e82af0b06f673 2025-08-14T21:20:15.5868489Z * [new tag] trunk/355462e1278d818deb9ef4a184073d5b66074816 -> trunk/355462e1278d818deb9ef4a184073d5b66074816 2025-08-14T21:20:15.5873756Z * [new tag] trunk/3626ba711b34397d1fbf0a9b1979f85cbf68b919 -> trunk/3626ba711b34397d1fbf0a9b1979f85cbf68b919 2025-08-14T21:20:15.5875180Z * [new tag] trunk/36f46d082a4954921cb8493223f000f2aab79ed7 -> trunk/36f46d082a4954921cb8493223f000f2aab79ed7 2025-08-14T21:20:15.5876630Z * [new tag] trunk/39aa3d1471549b7829c207d634dfdc1d26e346a2 -> trunk/39aa3d1471549b7829c207d634dfdc1d26e346a2 2025-08-14T21:20:15.5878252Z * [new tag] trunk/3a562374401113187ce2566b87e3f1d87d7c53aa -> trunk/3a562374401113187ce2566b87e3f1d87d7c53aa 2025-08-14T21:20:15.5879306Z * [new tag] trunk/3ac86e728dfaa7383ff7f865e9e7d33486188dae -> trunk/3ac86e728dfaa7383ff7f865e9e7d33486188dae 2025-08-14T21:20:15.5880703Z * [new tag] trunk/3be70dc30e893b552fc0f23ca06cd8f7949b6d08 -> trunk/3be70dc30e893b552fc0f23ca06cd8f7949b6d08 2025-08-14T21:20:15.5882747Z * [new tag] trunk/3cec82a7e9aea040a34dd7a2587ae6d3bd65dba0 -> trunk/3cec82a7e9aea040a34dd7a2587ae6d3bd65dba0 2025-08-14T21:20:15.5884169Z * [new tag] trunk/3cf7b4024ef83e44e9ae223dbff7c7ab68240cb2 -> trunk/3cf7b4024ef83e44e9ae223dbff7c7ab68240cb2 2025-08-14T21:20:15.5886163Z * [new tag] trunk/3ef2e1ef769582a82c6ddf150e9d11bf4bf1c44f -> trunk/3ef2e1ef769582a82c6ddf150e9d11bf4bf1c44f 2025-08-14T21:20:15.5887046Z * [new tag] trunk/3f1636ebef9b45e8a3cb0eb20d327ee6acb74be0 -> trunk/3f1636ebef9b45e8a3cb0eb20d327ee6acb74be0 2025-08-14T21:20:15.5888723Z * [new tag] trunk/3faee0a6318afcbbbb48687009a459214910d820 -> trunk/3faee0a6318afcbbbb48687009a459214910d820 2025-08-14T21:20:15.5890097Z * [new tag] trunk/3fcd79e023da7156ac584992ebab29205d3b7881 -> trunk/3fcd79e023da7156ac584992ebab29205d3b7881 2025-08-14T21:20:15.5891657Z * [new tag] trunk/3fe19a7a0af3f4d692af30476c320be18c7e8ae6 -> trunk/3fe19a7a0af3f4d692af30476c320be18c7e8ae6 2025-08-14T21:20:15.5893064Z * [new tag] trunk/41673110cd7c5960824cc74a6fcaeda1a8bc7a23 -> trunk/41673110cd7c5960824cc74a6fcaeda1a8bc7a23 2025-08-14T21:20:15.5894584Z * [new tag] trunk/4183d4ff3dcc1d87400326a9a7998c3f9e966f60 -> trunk/4183d4ff3dcc1d87400326a9a7998c3f9e966f60 2025-08-14T21:20:15.5895822Z * [new tag] trunk/422bd6808bb98cbbac31d157d9c82ad11ba9732d -> trunk/422bd6808bb98cbbac31d157d9c82ad11ba9732d 2025-08-14T21:20:15.5897310Z * [new tag] trunk/42e51cd4b3973a053fcfa80878a3f346fd158e9f -> trunk/42e51cd4b3973a053fcfa80878a3f346fd158e9f 2025-08-14T21:20:15.5898741Z * [new tag] trunk/4416433c7c625127b7f975c92f8ec98ea4c67fd3 -> trunk/4416433c7c625127b7f975c92f8ec98ea4c67fd3 2025-08-14T21:20:15.5900156Z * [new tag] trunk/45ba7ecda876685b083cbbe932450560c566826b -> trunk/45ba7ecda876685b083cbbe932450560c566826b 2025-08-14T21:20:15.5901902Z * [new tag] trunk/47a1db823dfcdacdb99f317428fc3791a18c5812 -> trunk/47a1db823dfcdacdb99f317428fc3791a18c5812 2025-08-14T21:20:15.5903148Z * [new tag] trunk/4a773e1e867f28a8ff0b15203e5cd9548f74fcee -> trunk/4a773e1e867f28a8ff0b15203e5cd9548f74fcee 2025-08-14T21:20:15.5904576Z * [new tag] trunk/4a90dc0c1f68d1f98832b169f792ed1bb195a0f3 -> trunk/4a90dc0c1f68d1f98832b169f792ed1bb195a0f3 2025-08-14T21:20:15.5906056Z * [new tag] trunk/4cde0acc0e4e795e1a12cbdd9b93c8c04c1fa05d -> trunk/4cde0acc0e4e795e1a12cbdd9b93c8c04c1fa05d 2025-08-14T21:20:15.5907491Z * [new tag] trunk/4d419a74610c32b1372f8802dcc61893740a23cf -> trunk/4d419a74610c32b1372f8802dcc61893740a23cf 2025-08-14T21:20:15.5909022Z * [new tag] trunk/4d5b3f2d5af7c8e4f41da4ffca53fafe8bb86235 -> trunk/4d5b3f2d5af7c8e4f41da4ffca53fafe8bb86235 2025-08-14T21:20:15.5910535Z * [new tag] trunk/4e2ddb5db67617f9f5309c8bba0c17adc84cadbc -> trunk/4e2ddb5db67617f9f5309c8bba0c17adc84cadbc 2025-08-14T21:20:15.5912064Z * [new tag] trunk/50a8c118754a6c5a46968f5c8e215ccba6831d42 -> trunk/50a8c118754a6c5a46968f5c8e215ccba6831d42 2025-08-14T21:20:15.5913552Z * [new tag] trunk/50f23ff6f883db5021dd6bab4c146434f98dd15d -> trunk/50f23ff6f883db5021dd6bab4c146434f98dd15d 2025-08-14T21:20:15.5914976Z * [new tag] trunk/515cb70367e84fcbad23fcc5b39eb1d7706df2aa -> trunk/515cb70367e84fcbad23fcc5b39eb1d7706df2aa 2025-08-14T21:20:15.5916465Z * [new tag] trunk/53e39494958b7e2278cc8176f63636e812e8945f -> trunk/53e39494958b7e2278cc8176f63636e812e8945f 2025-08-14T21:20:15.5917795Z * [new tag] trunk/556e2a73f4f0643f7c2aeb5c7dddda43388a40ce -> trunk/556e2a73f4f0643f7c2aeb5c7dddda43388a40ce 2025-08-14T21:20:15.5919182Z * [new tag] trunk/5665dc9ab76b84d7c90d845ffb0f6349b3621919 -> trunk/5665dc9ab76b84d7c90d845ffb0f6349b3621919 2025-08-14T21:20:15.5920568Z * [new tag] trunk/566c6d52ef1411c8262d7b9cf85e2044fdfbe1a3 -> trunk/566c6d52ef1411c8262d7b9cf85e2044fdfbe1a3 2025-08-14T21:20:15.5922042Z * [new tag] trunk/56c828bef93eada0e18d2cc013207831ca80cc99 -> trunk/56c828bef93eada0e18d2cc013207831ca80cc99 2025-08-14T21:20:15.5923479Z * [new tag] trunk/5737372862253a0ac0292407a5844796f02380ad -> trunk/5737372862253a0ac0292407a5844796f02380ad 2025-08-14T21:20:15.5924935Z * [new tag] trunk/57f738b6357cc8fcdde479a0948e723809a1a44d -> trunk/57f738b6357cc8fcdde479a0948e723809a1a44d 2025-08-14T21:20:15.5926448Z * [new tag] trunk/5a40c5784482255b9baf14086cc4b9349fc6d512 -> trunk/5a40c5784482255b9baf14086cc4b9349fc6d512 2025-08-14T21:20:15.5927907Z * [new tag] trunk/5a9c4cfce42b9eb87da0de40c5633f083115c307 -> trunk/5a9c4cfce42b9eb87da0de40c5633f083115c307 2025-08-14T21:20:15.5929404Z * [new tag] trunk/5ace061254af71aa83d1baae81aa1864c9746add -> trunk/5ace061254af71aa83d1baae81aa1864c9746add 2025-08-14T21:20:15.5930734Z * [new tag] trunk/5dddcd5b07c6644efca8d613f4eca1dc95daa87f -> trunk/5dddcd5b07c6644efca8d613f4eca1dc95daa87f 2025-08-14T21:20:15.5932360Z * [new tag] trunk/5ed4f9177907fe403ec4c4499d0d0e9be6b68fcf -> trunk/5ed4f9177907fe403ec4c4499d0d0e9be6b68fcf 2025-08-14T21:20:15.5933929Z * [new tag] trunk/5f1010fbb3850d99c8fdf9a9de2f79260cdc586a -> trunk/5f1010fbb3850d99c8fdf9a9de2f79260cdc586a 2025-08-14T21:20:15.5935191Z * [new tag] trunk/5f5f508aa836a46dfe88857fb223049616b94e93 -> trunk/5f5f508aa836a46dfe88857fb223049616b94e93 2025-08-14T21:20:15.5936711Z * [new tag] trunk/62bac0798100e0e06a86b7a4cee1788413e3d0ca -> trunk/62bac0798100e0e06a86b7a4cee1788413e3d0ca 2025-08-14T21:20:15.5938114Z * [new tag] trunk/63654ba4c5178fd12220cfc9d1c878af2fdd07cc -> trunk/63654ba4c5178fd12220cfc9d1c878af2fdd07cc 2025-08-14T21:20:15.5939555Z * [new tag] trunk/639778b3ee3b80e0894367fdc4442b58ae1b3a62 -> trunk/639778b3ee3b80e0894367fdc4442b58ae1b3a62 2025-08-14T21:20:15.5941002Z * [new tag] trunk/641ee7478150f26969968f49d8b358e199679a8a -> trunk/641ee7478150f26969968f49d8b358e199679a8a 2025-08-14T21:20:15.5942448Z * [new tag] trunk/65053c03a3d209060cb239d20a229dac37cf9dd1 -> trunk/65053c03a3d209060cb239d20a229dac37cf9dd1 2025-08-14T21:20:15.5943912Z * [new tag] trunk/652a6f5954d039d61dc6e6575ccf89d385d74537 -> trunk/652a6f5954d039d61dc6e6575ccf89d385d74537 2025-08-14T21:20:15.5945376Z * [new tag] trunk/685f15dbea66e8ffa8564752f81ad2f6cb447a14 -> trunk/685f15dbea66e8ffa8564752f81ad2f6cb447a14 2025-08-14T21:20:15.5946640Z * [new tag] trunk/68a4b4b2e336cfd4451ce6546d900568e5ddf96c -> trunk/68a4b4b2e336cfd4451ce6546d900568e5ddf96c 2025-08-14T21:20:15.5948404Z * [new tag] trunk/69a0a9aa7f5e320a02e97fa789d2f72baff1554f -> trunk/69a0a9aa7f5e320a02e97fa789d2f72baff1554f 2025-08-14T21:20:15.5949874Z * [new tag] trunk/6be6d06295c870c77a6eb69f96b3170d983520d5 -> trunk/6be6d06295c870c77a6eb69f96b3170d983520d5 2025-08-14T21:20:15.5951385Z * [new tag] trunk/6c05ea6475beaf3acc05e1bda0f3f8fe3bdc1d49 -> trunk/6c05ea6475beaf3acc05e1bda0f3f8fe3bdc1d49 2025-08-14T21:20:15.5952940Z * [new tag] trunk/6da11d9aafc0d84dc7f66030c181608ff2614f66 -> trunk/6da11d9aafc0d84dc7f66030c181608ff2614f66 2025-08-14T21:20:15.5954449Z * [new tag] trunk/6e8865fbc161270e2ffc52817e6c667df417a3f7 -> trunk/6e8865fbc161270e2ffc52817e6c667df417a3f7 2025-08-14T21:20:15.5956001Z * [new tag] trunk/6ea8376f84232048d6be0f7b2edf82aec1b61d58 -> trunk/6ea8376f84232048d6be0f7b2edf82aec1b61d58 2025-08-14T21:20:15.5957422Z * [new tag] trunk/6ee175195ac7853734d64704171993cc6265eb38 -> trunk/6ee175195ac7853734d64704171993cc6265eb38 2025-08-14T21:20:15.5958906Z * [new tag] trunk/6f0f4e0c3eacd479864319127915f869f64e1935 -> trunk/6f0f4e0c3eacd479864319127915f869f64e1935 2025-08-14T21:20:15.5960172Z * [new tag] trunk/70ccdec44b89e355a2cb03ba14a634284f7750f8 -> trunk/70ccdec44b89e355a2cb03ba14a634284f7750f8 2025-08-14T21:20:15.5961715Z * [new tag] trunk/72009ec6bebca7714f99c18449183787f202af4d -> trunk/72009ec6bebca7714f99c18449183787f202af4d 2025-08-14T21:20:15.5963245Z * [new tag] trunk/731ee31f7b6ba19307daab323f6196172b71aaf8 -> trunk/731ee31f7b6ba19307daab323f6196172b71aaf8 2025-08-14T21:20:15.5964826Z * [new tag] trunk/76a0609b6bddb2bc40f1eb4ade12885023653d59 -> trunk/76a0609b6bddb2bc40f1eb4ade12885023653d59 2025-08-14T21:20:15.5966276Z * [new tag] trunk/781e9a7724c47496e3d38a81e6dd6194cf098c41 -> trunk/781e9a7724c47496e3d38a81e6dd6194cf098c41 2025-08-14T21:20:15.5967799Z * [new tag] trunk/78a2fe1d42edeaa2ef7020b0fa0ac82ee4a640e4 -> trunk/78a2fe1d42edeaa2ef7020b0fa0ac82ee4a640e4 2025-08-14T21:20:15.5969227Z * [new tag] trunk/7a974a88f2c529a614baeabe4debd00fc8a3b299 -> trunk/7a974a88f2c529a614baeabe4debd00fc8a3b299 2025-08-14T21:20:15.5971121Z * [new tag] trunk/7ae0629d64b404e0ef5d9c931433ad25e65d6114 -> trunk/7ae0629d64b404e0ef5d9c931433ad25e65d6114 2025-08-14T21:20:15.5972519Z * [new tag] trunk/7d2ec704e47f4b740cdecda5534b305e8e1875ef -> trunk/7d2ec704e47f4b740cdecda5534b305e8e1875ef 2025-08-14T21:20:15.5973952Z * [new tag] trunk/7d87e358ac8440f666fabbfd99058bb5342be6ac -> trunk/7d87e358ac8440f666fabbfd99058bb5342be6ac 2025-08-14T21:20:15.5975280Z * [new tag] trunk/7e27347fd353928c99620495c8c531a5eba7d56b -> trunk/7e27347fd353928c99620495c8c531a5eba7d56b 2025-08-14T21:20:15.5977510Z * [new tag] trunk/7e91394955721c77645fcdb75a5d47a255d65020 -> trunk/7e91394955721c77645fcdb75a5d47a255d65020 2025-08-14T21:20:15.5978937Z * [new tag] trunk/7f4cb4a3e018a621add2a37a3a2f67b982d51001 -> trunk/7f4cb4a3e018a621add2a37a3a2f67b982d51001 2025-08-14T21:20:15.5980460Z * [new tag] trunk/7fbc22855c17741ae016992803b2e147a13aa22d -> trunk/7fbc22855c17741ae016992803b2e147a13aa22d 2025-08-14T21:20:15.5981955Z * [new tag] trunk/8047421fbb607d70ede13b9cd5a60b7b8bdfe348 -> trunk/8047421fbb607d70ede13b9cd5a60b7b8bdfe348 2025-08-14T21:20:15.5983427Z * [new tag] trunk/8088cfa592504a2897b4c78f8a46fe658ab5c2c2 -> trunk/8088cfa592504a2897b4c78f8a46fe658ab5c2c2 2025-08-14T21:20:15.5984999Z * [new tag] trunk/80cca8307943ba64168208b54028f55b2c71daff -> trunk/80cca8307943ba64168208b54028f55b2c71daff 2025-08-14T21:20:15.5986502Z * [new tag] trunk/8147370733bbdcd034cad54e9212e51885a11892 -> trunk/8147370733bbdcd034cad54e9212e51885a11892 2025-08-14T21:20:15.5988032Z * [new tag] trunk/83875cdb5594ccb3c9206b8eb5745fe1d011cf26 -> trunk/83875cdb5594ccb3c9206b8eb5745fe1d011cf26 2025-08-14T21:20:15.5989372Z * [new tag] trunk/8399cf88ce8399d2be93355f29d4cb69f51c0654 -> trunk/8399cf88ce8399d2be93355f29d4cb69f51c0654 2025-08-14T21:20:15.5990908Z * [new tag] trunk/842cc77ab9aafd518593c2fce077d6abb42a5b7f -> trunk/842cc77ab9aafd518593c2fce077d6abb42a5b7f 2025-08-14T21:20:15.5992421Z * [new tag] trunk/85db508af533649d0b3447ff3f0d5fe083150c84 -> trunk/85db508af533649d0b3447ff3f0d5fe083150c84 2025-08-14T21:20:15.5993789Z * [new tag] trunk/86eb65f7f06016bcd5d7951dc9d74bc3993a827a -> trunk/86eb65f7f06016bcd5d7951dc9d74bc3993a827a 2025-08-14T21:20:15.5995338Z * [new tag] trunk/87e6c4079d8ec7d04aff00ed82096b39836a8367 -> trunk/87e6c4079d8ec7d04aff00ed82096b39836a8367 2025-08-14T21:20:15.5996777Z * [new tag] trunk/89654db1abccf7e5f261989a150db4d1619ea2aa -> trunk/89654db1abccf7e5f261989a150db4d1619ea2aa 2025-08-14T21:20:15.5998124Z * [new tag] trunk/8a37f0c90392a2c38b7c5955471fa49edcaf5cb1 -> trunk/8a37f0c90392a2c38b7c5955471fa49edcaf5cb1 2025-08-14T21:20:15.5999590Z * [new tag] trunk/8ab5868a2199fe485c2d66533b9244ccb97e487d -> trunk/8ab5868a2199fe485c2d66533b9244ccb97e487d 2025-08-14T21:20:15.6001084Z * [new tag] trunk/8ae4d2652f64b8444b3d5314b9232bd2119bcde6 -> trunk/8ae4d2652f64b8444b3d5314b9232bd2119bcde6 2025-08-14T21:20:15.6002526Z * [new tag] trunk/8c41cb800ae0411f02ea5da34bd5ccc3790633b0 -> trunk/8c41cb800ae0411f02ea5da34bd5ccc3790633b0 2025-08-14T21:20:15.6004060Z * [new tag] trunk/8cb91e20bc205b1416648d0ffd98d1ba1f3a6fc4 -> trunk/8cb91e20bc205b1416648d0ffd98d1ba1f3a6fc4 2025-08-14T21:20:15.6005720Z * [new tag] trunk/8cfaf51d4e29c9bd9f49ecc98d955ed53df1a13d -> trunk/8cfaf51d4e29c9bd9f49ecc98d955ed53df1a13d 2025-08-14T21:20:15.6007033Z * [new tag] trunk/8d1cf529229dce7cd5ea04abb0faac83b87ca6d1 -> trunk/8d1cf529229dce7cd5ea04abb0faac83b87ca6d1 2025-08-14T21:20:15.6008635Z * [new tag] trunk/8d3d1c844303cb1d46123a1caa76d4cf83973347 -> trunk/8d3d1c844303cb1d46123a1caa76d4cf83973347 2025-08-14T21:20:15.6010001Z * [new tag] trunk/8d6d3246316e1767a57d5e855acd6208da753b75 -> trunk/8d6d3246316e1767a57d5e855acd6208da753b75 2025-08-14T21:20:15.6011466Z * [new tag] trunk/8e6a3138581152ab827a0997f34c470271399f5e -> trunk/8e6a3138581152ab827a0997f34c470271399f5e 2025-08-14T21:20:15.6012952Z * [new tag] trunk/8eee08d2279b98af2522debb6512d37e837e89e3 -> trunk/8eee08d2279b98af2522debb6512d37e837e89e3 2025-08-14T21:20:15.6014389Z * [new tag] trunk/90b78ee50f73b5c963996076a3d54b74b1b965be -> trunk/90b78ee50f73b5c963996076a3d54b74b1b965be 2025-08-14T21:20:15.6015704Z * [new tag] trunk/94b91a876327820a4bb6f5d39d156f13f2553ab6 -> trunk/94b91a876327820a4bb6f5d39d156f13f2553ab6 2025-08-14T21:20:15.6017656Z * [new tag] trunk/95210cc409dd578988c7116b47725c304dea54c7 -> trunk/95210cc409dd578988c7116b47725c304dea54c7 2025-08-14T21:20:15.6018877Z * [new tag] trunk/96bd33b2de79598566df395f32e27c4d33673f05 -> trunk/96bd33b2de79598566df395f32e27c4d33673f05 2025-08-14T21:20:15.6020444Z * [new tag] trunk/9708fcf92db88b80b9010c68662d634434da3106 -> trunk/9708fcf92db88b80b9010c68662d634434da3106 2025-08-14T21:20:15.6021963Z * [new tag] trunk/97c8c98f8dcb9c5c188b691d156e0043dba6c7f8 -> trunk/97c8c98f8dcb9c5c188b691d156e0043dba6c7f8 2025-08-14T21:20:15.6023511Z * [new tag] trunk/9903ca4f70bdc1653016256f5b4fd74fdfc609f8 -> trunk/9903ca4f70bdc1653016256f5b4fd74fdfc609f8 2025-08-14T21:20:15.6024982Z * [new tag] trunk/99bc2f94c1955657e950ebdad5f77e518785ccbd -> trunk/99bc2f94c1955657e950ebdad5f77e518785ccbd 2025-08-14T21:20:15.6026359Z * [new tag] trunk/9a06e6d0310da9d8a59ae05e8ec9c0201b55cacd -> trunk/9a06e6d0310da9d8a59ae05e8ec9c0201b55cacd 2025-08-14T21:20:15.6027969Z * [new tag] trunk/9a0f7a3bb01b235ea04581ee540970a098071b72 -> trunk/9a0f7a3bb01b235ea04581ee540970a098071b72 2025-08-14T21:20:15.6029520Z * [new tag] trunk/9b803cdbe298009f08340c1aaccb25aafbca95d8 -> trunk/9b803cdbe298009f08340c1aaccb25aafbca95d8 2025-08-14T21:20:15.6031159Z * [new tag] trunk/9ccd0f5e31ea54fcf42101dfbaacc103494e34df -> trunk/9ccd0f5e31ea54fcf42101dfbaacc103494e34df 2025-08-14T21:20:15.6032635Z * [new tag] trunk/9d37c960a4fc44d5ac334ca8bf775f85b95d76fc -> trunk/9d37c960a4fc44d5ac334ca8bf775f85b95d76fc 2025-08-14T21:20:15.6034142Z * [new tag] trunk/9e07673deb212c87b1c6fea23799a97474c476ed -> trunk/9e07673deb212c87b1c6fea23799a97474c476ed 2025-08-14T21:20:15.6035539Z * [new tag] trunk/9eedd2a20b64302d0d116ea2802b50948d2ebb09 -> trunk/9eedd2a20b64302d0d116ea2802b50948d2ebb09 2025-08-14T21:20:15.6037002Z * [new tag] trunk/9fa8ce26cf638504469852cbc3e7d04579fc8674 -> trunk/9fa8ce26cf638504469852cbc3e7d04579fc8674 2025-08-14T21:20:15.6038637Z * [new tag] trunk/a06ec54d40013c97fbffc174ea8f524ea5a95715 -> trunk/a06ec54d40013c97fbffc174ea8f524ea5a95715 2025-08-14T21:20:15.6040079Z * [new tag] trunk/a288b15ea9f87ddd665f249d492e0fb0861f5a69 -> trunk/a288b15ea9f87ddd665f249d492e0fb0861f5a69 2025-08-14T21:20:15.6041543Z * [new tag] trunk/a2fd106d670bb4990cebfd00f25ecbae4145e76c -> trunk/a2fd106d670bb4990cebfd00f25ecbae4145e76c 2025-08-14T21:20:15.6043126Z * [new tag] trunk/a354fa91e26b376d96385a2206c5ff5b42aa4600 -> trunk/a354fa91e26b376d96385a2206c5ff5b42aa4600 2025-08-14T21:20:15.6044670Z * [new tag] trunk/a4f69a5da08eace1c1e6469dec6a18aa842da73b -> trunk/a4f69a5da08eace1c1e6469dec6a18aa842da73b 2025-08-14T21:20:15.6046247Z * [new tag] trunk/a53d14d5f846ac44f6c205abb1c5bc4d2f3126ae -> trunk/a53d14d5f846ac44f6c205abb1c5bc4d2f3126ae 2025-08-14T21:20:15.6047890Z * [new tag] trunk/a5652407e4f3d772fc44486ac2abf756decf0861 -> trunk/a5652407e4f3d772fc44486ac2abf756decf0861 2025-08-14T21:20:15.6049822Z * [new tag] trunk/a7abf57aabec0ce686092e2d66e53ba185dbc56b -> trunk/a7abf57aabec0ce686092e2d66e53ba185dbc56b 2025-08-14T21:20:15.6051117Z * [new tag] trunk/a84b60c0c4016785fd93b7b8a0c04f2d0770d332 -> trunk/a84b60c0c4016785fd93b7b8a0c04f2d0770d332 2025-08-14T21:20:15.6052673Z * [new tag] trunk/aa75e917bdb0f95bb6dee81853c2d3c4ab3e1883 -> trunk/aa75e917bdb0f95bb6dee81853c2d3c4ab3e1883 2025-08-14T21:20:15.6054231Z * [new tag] trunk/adcca7d9a1c053495e99012de801b2ea237faad0 -> trunk/adcca7d9a1c053495e99012de801b2ea237faad0 2025-08-14T21:20:15.6055728Z * [new tag] trunk/af10f1f86cc4effc93142a447693d8be55966615 -> trunk/af10f1f86cc4effc93142a447693d8be55966615 2025-08-14T21:20:15.6057239Z * [new tag] trunk/af3cabc55d5699f4da528e1ca39d83338f84ae8c -> trunk/af3cabc55d5699f4da528e1ca39d83338f84ae8c 2025-08-14T21:20:15.6058818Z * [new tag] trunk/b0df7715e8c590c0001d1f9cdb97057be80c9107 -> trunk/b0df7715e8c590c0001d1f9cdb97057be80c9107 2025-08-14T21:20:15.6060348Z * [new tag] trunk/b149c7204c218e7c4d6594a89dd74f72bd480ec5 -> trunk/b149c7204c218e7c4d6594a89dd74f72bd480ec5 2025-08-14T21:20:15.6061946Z * [new tag] trunk/b1a602762e6a6674b406a3137e7e7a678885a97b -> trunk/b1a602762e6a6674b406a3137e7e7a678885a97b 2025-08-14T21:20:15.6063454Z * [new tag] trunk/b1f43548cad8fc0e30bda250f6e196310fa7a4bc -> trunk/b1f43548cad8fc0e30bda250f6e196310fa7a4bc 2025-08-14T21:20:15.6064989Z * [new tag] trunk/b219ca2a00a305753c4f1ea4c9c5d23243d54753 -> trunk/b219ca2a00a305753c4f1ea4c9c5d23243d54753 2025-08-14T21:20:15.6066494Z * [new tag] trunk/b4596895b9d85a686c2cb978938b0a7797b3690a -> trunk/b4596895b9d85a686c2cb978938b0a7797b3690a 2025-08-14T21:20:15.6068079Z * [new tag] trunk/b5fd7223b1bf44720dc9183bda7dfcf7aeccff02 -> trunk/b5fd7223b1bf44720dc9183bda7dfcf7aeccff02 2025-08-14T21:20:15.6069530Z * [new tag] trunk/b602ea9cab7d43a7ee7b4051227090f23fbd3dbf -> trunk/b602ea9cab7d43a7ee7b4051227090f23fbd3dbf 2025-08-14T21:20:15.6071142Z * [new tag] trunk/b6b74aed604bd2e96389ff99aaaf39abc64fdc64 -> trunk/b6b74aed604bd2e96389ff99aaaf39abc64fdc64 2025-08-14T21:20:15.6072648Z * [new tag] trunk/b7db86600a2614adc71c92ca42d359a7ac534d78 -> trunk/b7db86600a2614adc71c92ca42d359a7ac534d78 2025-08-14T21:20:15.6075113Z * [new tag] trunk/b9003ed3d87699e81e436719625a21996a6654e5 -> trunk/b9003ed3d87699e81e436719625a21996a6654e5 2025-08-14T21:20:15.6076584Z * [new tag] trunk/b90feeac86bda00afc2789321bcd706015ff44e3 -> trunk/b90feeac86bda00afc2789321bcd706015ff44e3 2025-08-14T21:20:15.6078879Z * [new tag] trunk/b9d7de3a094598c3dc0dd52e57bce30eb684c9d8 -> trunk/b9d7de3a094598c3dc0dd52e57bce30eb684c9d8 2025-08-14T21:20:15.6080269Z * [new tag] trunk/ba47821f524eee50a214ed39fa2e7765d54aabf4 -> trunk/ba47821f524eee50a214ed39fa2e7765d54aabf4 2025-08-14T21:20:15.6081728Z * [new tag] trunk/ba4ccf5d67e3d237f435eacc2bce3c6025f08491 -> trunk/ba4ccf5d67e3d237f435eacc2bce3c6025f08491 2025-08-14T21:20:15.6083320Z * [new tag] trunk/bcf23ecc476df2bd7479f142567213e2623308ee -> trunk/bcf23ecc476df2bd7479f142567213e2623308ee 2025-08-14T21:20:15.6084839Z * [new tag] trunk/be53f609aaf6f01e2863f490975ea9eaac3ee9ff -> trunk/be53f609aaf6f01e2863f490975ea9eaac3ee9ff 2025-08-14T21:20:15.6086383Z * [new tag] trunk/beb4d7816dedc67a5de1f82e5a45b5910f407941 -> trunk/beb4d7816dedc67a5de1f82e5a45b5910f407941 2025-08-14T21:20:15.6087943Z * [new tag] trunk/bfc873d02ec413344717493e4175a902921359fd -> trunk/bfc873d02ec413344717493e4175a902921359fd 2025-08-14T21:20:15.6097059Z * [new tag] trunk/c184cb3852f0ff2d16a489d61abc3739c309e6ca -> trunk/c184cb3852f0ff2d16a489d61abc3739c309e6ca 2025-08-14T21:20:15.6097901Z * [new tag] trunk/c24ca7f4bf79f62fd623d76346ca27e53f731431 -> trunk/c24ca7f4bf79f62fd623d76346ca27e53f731431 2025-08-14T21:20:15.6098918Z * [new tag] trunk/c3dc8dc4122977893004c49d10e4676cd0a97da4 -> trunk/c3dc8dc4122977893004c49d10e4676cd0a97da4 2025-08-14T21:20:15.6099787Z * [new tag] trunk/c5ec5458a547f7a774468ea0eb2258d3de596492 -> trunk/c5ec5458a547f7a774468ea0eb2258d3de596492 2025-08-14T21:20:15.6100623Z * [new tag] trunk/c5efc5c8a66eca84865015058b3221013ebfe685 -> trunk/c5efc5c8a66eca84865015058b3221013ebfe685 2025-08-14T21:20:15.6101529Z * [new tag] trunk/c6563341208003f64c131854a9cf029555f786d2 -> trunk/c6563341208003f64c131854a9cf029555f786d2 2025-08-14T21:20:15.6102322Z * [new tag] trunk/c6d78d4dbda53837d298d23a5fbc09af90a42d9e -> trunk/c6d78d4dbda53837d298d23a5fbc09af90a42d9e 2025-08-14T21:20:15.6103329Z * [new tag] trunk/c8205cb35435f39d2c26f6c94b45e4adeb6dcb23 -> trunk/c8205cb35435f39d2c26f6c94b45e4adeb6dcb23 2025-08-14T21:20:15.6104128Z * [new tag] trunk/c859ba7114b1fcb49527e090745fa17091d1f8d5 -> trunk/c859ba7114b1fcb49527e090745fa17091d1f8d5 2025-08-14T21:20:15.6105059Z * [new tag] trunk/c86040a8e68f754b90a84099187d3624954c7f36 -> trunk/c86040a8e68f754b90a84099187d3624954c7f36 2025-08-14T21:20:15.6105941Z * [new tag] trunk/c9671dc865aa0fc1cb86df754e355b44d8e02bb4 -> trunk/c9671dc865aa0fc1cb86df754e355b44d8e02bb4 2025-08-14T21:20:15.6106893Z * [new tag] trunk/ca7315c17162ea21b1ca5ba23f4bf6168766c7b9 -> trunk/ca7315c17162ea21b1ca5ba23f4bf6168766c7b9 2025-08-14T21:20:15.6107958Z * [new tag] trunk/cae2b5e3d223829bdc553fc8601df4b1c1554cff -> trunk/cae2b5e3d223829bdc553fc8601df4b1c1554cff 2025-08-14T21:20:15.6109444Z * [new tag] trunk/cbffde774557752cf20447d42d99ec6102673c31 -> trunk/cbffde774557752cf20447d42d99ec6102673c31 2025-08-14T21:20:15.6110993Z * [new tag] trunk/cd8d8c18f5bafdc1c73d5ac0129e7b4d76ab45bc -> trunk/cd8d8c18f5bafdc1c73d5ac0129e7b4d76ab45bc 2025-08-14T21:20:15.6112506Z * [new tag] trunk/cf0a0dcb0afa5e84b95461cc542f862b51ca96bf -> trunk/cf0a0dcb0afa5e84b95461cc542f862b51ca96bf 2025-08-14T21:20:15.6113967Z * [new tag] trunk/cf4964be68fa9f4ffc334f01cce42d7424b1cc81 -> trunk/cf4964be68fa9f4ffc334f01cce42d7424b1cc81 2025-08-14T21:20:15.6115892Z * [new tag] trunk/d0e2240f680ea2a553f7ee8188f52482e130bfd0 -> trunk/d0e2240f680ea2a553f7ee8188f52482e130bfd0 2025-08-14T21:20:15.6117231Z * [new tag] trunk/d1950d4bb5cba8fb6b23e4d283eea5b9801737e2 -> trunk/d1950d4bb5cba8fb6b23e4d283eea5b9801737e2 2025-08-14T21:20:15.6118742Z * [new tag] trunk/d20c4c20e61adecf00335c4d8c22eb1ace472cd3 -> trunk/d20c4c20e61adecf00335c4d8c22eb1ace472cd3 2025-08-14T21:20:15.6120236Z * [new tag] trunk/d25c4f954d599ea512e2f70cd6df101c21479d4c -> trunk/d25c4f954d599ea512e2f70cd6df101c21479d4c 2025-08-14T21:20:15.6121779Z * [new tag] trunk/d3d359dbafa89173a371e2637f22b47398e94a24 -> trunk/d3d359dbafa89173a371e2637f22b47398e94a24 2025-08-14T21:20:15.6123297Z * [new tag] trunk/d46768db04499d07a5b0db984112a6d1b7d3b0c1 -> trunk/d46768db04499d07a5b0db984112a6d1b7d3b0c1 2025-08-14T21:20:15.6125050Z * [new tag] trunk/d4c1a08c89f37d249a0146ff511c82ecc5c53b8f -> trunk/d4c1a08c89f37d249a0146ff511c82ecc5c53b8f 2025-08-14T21:20:15.6126603Z * [new tag] trunk/d556586448f3caab85673c7da0978fe31c7748f7 -> trunk/d556586448f3caab85673c7da0978fe31c7748f7 2025-08-14T21:20:15.6128449Z * [new tag] trunk/d670304001429a1a833255a918ed788d7ec4989a -> trunk/d670304001429a1a833255a918ed788d7ec4989a 2025-08-14T21:20:15.6129866Z * [new tag] trunk/d6786741a77aba200c78002646cc069b7a1799b0 -> trunk/d6786741a77aba200c78002646cc069b7a1799b0 2025-08-14T21:20:15.6131806Z * [new tag] trunk/d68c323692dedcbb74e670801e3502944fd790ff -> trunk/d68c323692dedcbb74e670801e3502944fd790ff 2025-08-14T21:20:15.6133086Z * [new tag] trunk/d8cb3db5339b45e4b745b2b883ef3ecde9843e2c -> trunk/d8cb3db5339b45e4b745b2b883ef3ecde9843e2c 2025-08-14T21:20:15.6134385Z * [new tag] trunk/da1f608ca33f3062535d0a4866d95db19e72fcbd -> trunk/da1f608ca33f3062535d0a4866d95db19e72fcbd 2025-08-14T21:20:15.6135754Z * [new tag] trunk/db0b7f1cc9bb3fe71aaf8b964a644147ae8e1c35 -> trunk/db0b7f1cc9bb3fe71aaf8b964a644147ae8e1c35 2025-08-14T21:20:15.6137322Z * [new tag] trunk/db32b60662b2f2bdcad980127d5dc4b66b02a7e4 -> trunk/db32b60662b2f2bdcad980127d5dc4b66b02a7e4 2025-08-14T21:20:15.6138824Z * [new tag] trunk/db763b17175553ba09637362eb9773a91997a7ad -> trunk/db763b17175553ba09637362eb9773a91997a7ad 2025-08-14T21:20:15.6140343Z * [new tag] trunk/db78943a1ca13a32a3d6045eb15e2b719ee13a2f -> trunk/db78943a1ca13a32a3d6045eb15e2b719ee13a2f 2025-08-14T21:20:15.6141879Z * [new tag] trunk/dc0d18e023d9b7e314ebba0f234b6cb1579dbcfd -> trunk/dc0d18e023d9b7e314ebba0f234b6cb1579dbcfd 2025-08-14T21:20:15.6143364Z * [new tag] trunk/dd21c8a578038ab2841a7ba809a06921093ac9d8 -> trunk/dd21c8a578038ab2841a7ba809a06921093ac9d8 2025-08-14T21:20:15.6144911Z * [new tag] trunk/deea71a90e05eb320c04bebfead5317746637f0d -> trunk/deea71a90e05eb320c04bebfead5317746637f0d 2025-08-14T21:20:15.6146394Z * [new tag] trunk/df55ec7d4b35f6d21691e9dd41c82f27de762948 -> trunk/df55ec7d4b35f6d21691e9dd41c82f27de762948 2025-08-14T21:20:15.6147935Z * [new tag] trunk/e1cf0d496ea85d1807c8c740f296e77bf7bdc1df -> trunk/e1cf0d496ea85d1807c8c740f296e77bf7bdc1df 2025-08-14T21:20:15.6150789Z * [new tag] trunk/e248719ac03c103767ab72034f6b9fd56855bf98 -> trunk/e248719ac03c103767ab72034f6b9fd56855bf98 2025-08-14T21:20:15.6152272Z * [new tag] trunk/e49762026070f66be41bfa6537fbcf9bfc24e558 -> trunk/e49762026070f66be41bfa6537fbcf9bfc24e558 2025-08-14T21:20:15.6153837Z * [new tag] trunk/e4de93f6a3e342bab34d3757cf90ec0ccc87e168 -> trunk/e4de93f6a3e342bab34d3757cf90ec0ccc87e168 2025-08-14T21:20:15.6155140Z * [new tag] trunk/e619c6bb90b9dedaccd3cbeed86a288993a4e33f -> trunk/e619c6bb90b9dedaccd3cbeed86a288993a4e33f 2025-08-14T21:20:15.6156698Z * [new tag] trunk/e63c2b21c186a7d2ab8a8953b8aa1535f2e96e58 -> trunk/e63c2b21c186a7d2ab8a8953b8aa1535f2e96e58 2025-08-14T21:20:15.6158197Z * [new tag] trunk/e7152ff8a6a929a0db7f3f4a72a5b6d471769cd3 -> trunk/e7152ff8a6a929a0db7f3f4a72a5b6d471769cd3 2025-08-14T21:20:15.6159808Z * [new tag] trunk/e96c7c4bb0f6aeae2ab3b6f040f7d67edbec199a -> trunk/e96c7c4bb0f6aeae2ab3b6f040f7d67edbec199a 2025-08-14T21:20:15.6161277Z * [new tag] trunk/e9eb2096a59a79e7a94c3e28a0715e040369f34c -> trunk/e9eb2096a59a79e7a94c3e28a0715e040369f34c 2025-08-14T21:20:15.6162780Z * [new tag] trunk/eac2d9d695a32dd456050f45cac35134ec3809f4 -> trunk/eac2d9d695a32dd456050f45cac35134ec3809f4 2025-08-14T21:20:15.6164243Z * [new tag] trunk/ecde76c764752540edf9ef62a97936c86d984b17 -> trunk/ecde76c764752540edf9ef62a97936c86d984b17 2025-08-14T21:20:15.6165789Z * [new tag] trunk/ecea81117b2fdc52907c97b3c32d779e07b5d55b -> trunk/ecea81117b2fdc52907c97b3c32d779e07b5d55b 2025-08-14T21:20:15.6167376Z * [new tag] trunk/edaa151d0d5a4e75fbec9843f49cc78770eb61fb -> trunk/edaa151d0d5a4e75fbec9843f49cc78770eb61fb 2025-08-14T21:20:15.6168943Z * [new tag] trunk/ee1b0412b919dfb358d5a697b3be49621497fbc2 -> trunk/ee1b0412b919dfb358d5a697b3be49621497fbc2 2025-08-14T21:20:15.6170484Z * [new tag] trunk/ee1fb43450c2e985657f95a91b68328d6f20f24e -> trunk/ee1fb43450c2e985657f95a91b68328d6f20f24e 2025-08-14T21:20:15.6172180Z * [new tag] trunk/ee89cc7a0acd69de25f98fe4ef828546db7b444c -> trunk/ee89cc7a0acd69de25f98fe4ef828546db7b444c 2025-08-14T21:20:15.6173624Z * [new tag] trunk/ee9f8ba11d664b871a9e0c7933fdc8571635b78c -> trunk/ee9f8ba11d664b871a9e0c7933fdc8571635b78c 2025-08-14T21:20:15.6175772Z * [new tag] trunk/eed9dbf70f43ee529fec78ac00ed9a4fd74c6e76 -> trunk/eed9dbf70f43ee529fec78ac00ed9a4fd74c6e76 2025-08-14T21:20:15.6177109Z * [new tag] trunk/f077c2402e4eb5b0ed562b4ee5b7a0503f26ef94 -> trunk/f077c2402e4eb5b0ed562b4ee5b7a0503f26ef94 2025-08-14T21:20:15.6178672Z * [new tag] trunk/f0980fc0bbd656d6c02d23ad97e945353b314f35 -> trunk/f0980fc0bbd656d6c02d23ad97e945353b314f35 2025-08-14T21:20:15.6180193Z * [new tag] trunk/f15ada5c6fad97a7dcbfa4673f067b6942dda640 -> trunk/f15ada5c6fad97a7dcbfa4673f067b6942dda640 2025-08-14T21:20:15.6181750Z * [new tag] trunk/f27232a2134150cb5e55d26a74d8c36c6a961ca5 -> trunk/f27232a2134150cb5e55d26a74d8c36c6a961ca5 2025-08-14T21:20:15.6183274Z * [new tag] trunk/f33ce40bc062a281e1a1f57e8c1926d0a7d155cc -> trunk/f33ce40bc062a281e1a1f57e8c1926d0a7d155cc 2025-08-14T21:20:15.6184853Z * [new tag] trunk/f341077ce4710172da20cfad916ee37159bfe9fe -> trunk/f341077ce4710172da20cfad916ee37159bfe9fe 2025-08-14T21:20:15.6186391Z * [new tag] trunk/f3a4d742ece08de4cb0e59dcc62e0093a7d0b0c7 -> trunk/f3a4d742ece08de4cb0e59dcc62e0093a7d0b0c7 2025-08-14T21:20:15.6187893Z * [new tag] trunk/f3f159ff8c4bad2edec99c68a941c628e983d04c -> trunk/f3f159ff8c4bad2edec99c68a941c628e983d04c 2025-08-14T21:20:15.6189408Z * [new tag] trunk/f60454cce8b93e5bbf67f2f3c88c8ac01ed65457 -> trunk/f60454cce8b93e5bbf67f2f3c88c8ac01ed65457 2025-08-14T21:20:15.6190862Z * [new tag] trunk/f7b2f3314cf7aede67d5fa5c75e4243208484344 -> trunk/f7b2f3314cf7aede67d5fa5c75e4243208484344 2025-08-14T21:20:15.6192459Z * [new tag] trunk/f8f0414a5983ff481a2188e0c18594150430c8c5 -> trunk/f8f0414a5983ff481a2188e0c18594150430c8c5 2025-08-14T21:20:15.6193945Z * [new tag] trunk/f95b58c2844b3444cd8446fed8570729dc4216eb -> trunk/f95b58c2844b3444cd8446fed8570729dc4216eb 2025-08-14T21:20:15.6195483Z * [new tag] trunk/f990490a23815ea6ee27e487c70ba2cf513ba43d -> trunk/f990490a23815ea6ee27e487c70ba2cf513ba43d 2025-08-14T21:20:15.6196963Z * [new tag] trunk/fb887c3bb588cfe782615e67f6c26db636b8539b -> trunk/fb887c3bb588cfe782615e67f6c26db636b8539b 2025-08-14T21:20:15.6199314Z * [new tag] trunk/fc25c68f20f772290927a7031b998b92615259cf -> trunk/fc25c68f20f772290927a7031b998b92615259cf 2025-08-14T21:20:15.6200644Z * [new tag] trunk/fc80f6859e0ccf66513a40f04b9e735e759d4ddb -> trunk/fc80f6859e0ccf66513a40f04b9e735e759d4ddb 2025-08-14T21:20:15.6202260Z * [new tag] trunk/fdfd69bb05488d76123db9cc1cdd90ac4137bbfb -> trunk/fdfd69bb05488d76123db9cc1cdd90ac4137bbfb 2025-08-14T21:20:15.6204407Z * [new tag] trunk/fe3f5fe4ea2ff6f56406dc5d954636ebb08d0a08 -> trunk/fe3f5fe4ea2ff6f56406dc5d954636ebb08d0a08 2025-08-14T21:20:15.6205786Z * [new tag] trunk/fea7e9dd37c02c334b130f6624af6163fde6b2ab -> trunk/fea7e9dd37c02c334b130f6624af6163fde6b2ab 2025-08-14T21:20:15.6207166Z * [new tag] trunk/ff0d56d03592aa03f3ced8359241d21df1783393 -> trunk/ff0d56d03592aa03f3ced8359241d21df1783393 2025-08-14T21:20:15.6208453Z * [new tag] v0.1.1 -> v0.1.1 2025-08-14T21:20:15.6209945Z * [new tag] v0.1.10 -> v0.1.10 2025-08-14T21:20:15.6211400Z * [new tag] v0.1.11 -> v0.1.11 2025-08-14T21:20:15.6212653Z * [new tag] v0.1.12 -> v0.1.12 2025-08-14T21:20:15.6214071Z * [new tag] v0.1.2 -> v0.1.2 2025-08-14T21:20:15.6215455Z * [new tag] v0.1.3 -> v0.1.3 2025-08-14T21:20:15.6216783Z * [new tag] v0.1.4 -> v0.1.4 2025-08-14T21:20:15.6218193Z * [new tag] v0.1.5 -> v0.1.5 2025-08-14T21:20:15.6219595Z * [new tag] v0.1.6 -> v0.1.6 2025-08-14T21:20:15.6220761Z * [new tag] v0.1.7 -> v0.1.7 2025-08-14T21:20:15.6222197Z * [new tag] v0.1.8 -> v0.1.8 2025-08-14T21:20:15.6223340Z * [new tag] v0.1.9 -> v0.1.9 2025-08-14T21:20:15.6224832Z * [new tag] v0.2.0 -> v0.2.0 2025-08-14T21:20:15.6226354Z * [new tag] v0.3.0 -> v0.3.0 2025-08-14T21:20:15.6227801Z * [new tag] v0.3.1 -> v0.3.1 2025-08-14T21:20:15.6229187Z * [new tag] v0.4.0 -> v0.4.0 2025-08-14T21:20:15.6230407Z * [new tag] v0.4.1 -> v0.4.1 2025-08-14T21:20:15.6231895Z * [new tag] v1.0.0 -> v1.0.0 2025-08-14T21:20:15.6233309Z * [new tag] v1.0.0a0 -> v1.0.0a0 2025-08-14T21:20:15.6234722Z * [new tag] v1.0.1 -> v1.0.1 2025-08-14T21:20:15.6236197Z * [new tag] v1.0rc0 -> v1.0rc0 2025-08-14T21:20:15.6237931Z * [new tag] v1.0rc1 -> v1.0rc1 2025-08-14T21:20:15.6239378Z * [new tag] v1.1.0 -> v1.1.0 2025-08-14T21:20:15.6240779Z * [new tag] v1.1.0a0 -> v1.1.0a0 2025-08-14T21:20:15.6242370Z * [new tag] v1.10.0 -> v1.10.0 2025-08-14T21:20:15.6243923Z * [new tag] v1.10.0-rc1 -> v1.10.0-rc1 2025-08-14T21:20:15.6245438Z * [new tag] v1.10.0-rc2 -> v1.10.0-rc2 2025-08-14T21:20:15.6246551Z * [new tag] v1.10.0-rc3 -> v1.10.0-rc3 2025-08-14T21:20:15.6248270Z * [new tag] v1.10.1 -> v1.10.1 2025-08-14T21:20:15.6249329Z * [new tag] v1.10.1-rc1 -> v1.10.1-rc1 2025-08-14T21:20:15.6250507Z * [new tag] v1.10.2 -> v1.10.2 2025-08-14T21:20:15.6251723Z * [new tag] v1.10.2-rc1 -> v1.10.2-rc1 2025-08-14T21:20:15.6253295Z * [new tag] v1.11.0 -> v1.11.0 2025-08-14T21:20:15.6254754Z * [new tag] v1.11.0-rc1 -> v1.11.0-rc1 2025-08-14T21:20:15.6256377Z * [new tag] v1.11.0-rc2 -> v1.11.0-rc2 2025-08-14T21:20:15.6257845Z * [new tag] v1.11.0-rc3 -> v1.11.0-rc3 2025-08-14T21:20:15.6259328Z * [new tag] v1.11.0-rc4 -> v1.11.0-rc4 2025-08-14T21:20:15.6260814Z * [new tag] v1.11.0-rc5 -> v1.11.0-rc5 2025-08-14T21:20:15.6261916Z * [new tag] v1.11.0-rc6 -> v1.11.0-rc6 2025-08-14T21:20:15.6263073Z * [new tag] v1.11.0-rc7 -> v1.11.0-rc7 2025-08-14T21:20:15.6264567Z * [new tag] v1.12.0 -> v1.12.0 2025-08-14T21:20:15.6266198Z * [new tag] v1.12.0-rc1 -> v1.12.0-rc1 2025-08-14T21:20:15.6267160Z * [new tag] v1.12.0-rc2 -> v1.12.0-rc2 2025-08-14T21:20:15.6268848Z * [new tag] v1.12.0-rc3 -> v1.12.0-rc3 2025-08-14T21:20:15.6270315Z * [new tag] v1.12.0-rc4 -> v1.12.0-rc4 2025-08-14T21:20:15.6271806Z * [new tag] v1.12.0-rc5 -> v1.12.0-rc5 2025-08-14T21:20:15.6273244Z * [new tag] v1.12.0-rc6 -> v1.12.0-rc6 2025-08-14T21:20:15.6274216Z * [new tag] v1.12.0-rc7 -> v1.12.0-rc7 2025-08-14T21:20:15.6275551Z * [new tag] v1.12.0-rc8 -> v1.12.0-rc8 2025-08-14T21:20:15.6276548Z * [new tag] v1.12.1 -> v1.12.1 2025-08-14T21:20:15.6278331Z * [new tag] v1.12.1-rc1 -> v1.12.1-rc1 2025-08-14T21:20:15.6279759Z * [new tag] v1.12.1-rc2 -> v1.12.1-rc2 2025-08-14T21:20:15.6281257Z * [new tag] v1.12.1-rc3 -> v1.12.1-rc3 2025-08-14T21:20:15.6282413Z * [new tag] v1.12.1-rc4 -> v1.12.1-rc4 2025-08-14T21:20:15.6283715Z * [new tag] v1.12.1-rc5 -> v1.12.1-rc5 2025-08-14T21:20:15.6285534Z * [new tag] v1.13.0 -> v1.13.0 2025-08-14T21:20:15.6286684Z * [new tag] v1.13.0-rc1 -> v1.13.0-rc1 2025-08-14T21:20:15.6288358Z * [new tag] v1.13.0-rc2 -> v1.13.0-rc2 2025-08-14T21:20:15.6289455Z * [new tag] v1.13.0-rc3 -> v1.13.0-rc3 2025-08-14T21:20:15.6291265Z * [new tag] v1.13.0-rc4 -> v1.13.0-rc4 2025-08-14T21:20:15.6292239Z * [new tag] v1.13.0-rc5 -> v1.13.0-rc5 2025-08-14T21:20:15.6293348Z * [new tag] v1.13.0-rc6 -> v1.13.0-rc6 2025-08-14T21:20:15.6294956Z * [new tag] v1.13.1 -> v1.13.1 2025-08-14T21:20:15.6295969Z * [new tag] v1.13.1-rc1 -> v1.13.1-rc1 2025-08-14T21:20:15.6297582Z * [new tag] v1.2.0 -> v1.2.0 2025-08-14T21:20:15.6298964Z * [new tag] v1.2.0a0 -> v1.2.0a0 2025-08-14T21:20:15.6300256Z * [new tag] v1.3.0 -> v1.3.0 2025-08-14T21:20:15.6301789Z * [new tag] v1.3.0a0 -> v1.3.0a0 2025-08-14T21:20:15.6302764Z * [new tag] v1.3.1 -> v1.3.1 2025-08-14T21:20:15.6304363Z * [new tag] v1.4.0 -> v1.4.0 2025-08-14T21:20:15.6305658Z * [new tag] v1.4.0a0 -> v1.4.0a0 2025-08-14T21:20:15.6306718Z * [new tag] v1.4.1 -> v1.4.1 2025-08-14T21:20:15.6308428Z * [new tag] v1.5.0 -> v1.5.0 2025-08-14T21:20:15.6309977Z * [new tag] v1.5.0-rc1 -> v1.5.0-rc1 2025-08-14T21:20:15.6311469Z * [new tag] v1.5.0-rc2 -> v1.5.0-rc2 2025-08-14T21:20:15.6312960Z * [new tag] v1.5.0-rc3 -> v1.5.0-rc3 2025-08-14T21:20:15.6314035Z * [new tag] v1.5.0-rc4 -> v1.5.0-rc4 2025-08-14T21:20:15.6315552Z * [new tag] v1.5.0-rc5 -> v1.5.0-rc5 2025-08-14T21:20:15.6317023Z * [new tag] v1.5.1 -> v1.5.1 2025-08-14T21:20:15.6318025Z * [new tag] v1.5.1-rc1 -> v1.5.1-rc1 2025-08-14T21:20:15.6319381Z * [new tag] v1.6.0 -> v1.6.0 2025-08-14T21:20:15.6320877Z * [new tag] v1.6.0-rc1 -> v1.6.0-rc1 2025-08-14T21:20:15.6322379Z * [new tag] v1.6.0-rc2 -> v1.6.0-rc2 2025-08-14T21:20:15.6323689Z * [new tag] v1.6.0-rc3 -> v1.6.0-rc3 2025-08-14T21:20:15.6325378Z * [new tag] v1.6.0-rc4 -> v1.6.0-rc4 2025-08-14T21:20:15.6327268Z * [new tag] v1.6.0-rc5 -> v1.6.0-rc5 2025-08-14T21:20:15.6328639Z * [new tag] v1.6.0-rc6 -> v1.6.0-rc6 2025-08-14T21:20:15.6329626Z * [new tag] v1.6.0-rc7 -> v1.6.0-rc7 2025-08-14T21:20:15.6331297Z * [new tag] v1.7.0 -> v1.7.0 2025-08-14T21:20:15.6332921Z * [new tag] v1.7.0-rc1 -> v1.7.0-rc1 2025-08-14T21:20:15.6334086Z * [new tag] v1.7.0-rc2 -> v1.7.0-rc2 2025-08-14T21:20:15.6335721Z * [new tag] v1.7.0-rc3 -> v1.7.0-rc3 2025-08-14T21:20:15.6336678Z * [new tag] v1.7.0-rc4 -> v1.7.0-rc4 2025-08-14T21:20:15.6338293Z * [new tag] v1.7.1 -> v1.7.1 2025-08-14T21:20:15.6339919Z * [new tag] v1.7.1-rc1 -> v1.7.1-rc1 2025-08-14T21:20:15.6341285Z * [new tag] v1.7.1-rc2 -> v1.7.1-rc2 2025-08-14T21:20:15.6342352Z * [new tag] v1.7.1-rc3 -> v1.7.1-rc3 2025-08-14T21:20:15.6343994Z * [new tag] v1.8.0 -> v1.8.0 2025-08-14T21:20:15.6345000Z * [new tag] v1.8.0-rc1 -> v1.8.0-rc1 2025-08-14T21:20:15.6346690Z * [new tag] v1.8.0-rc2 -> v1.8.0-rc2 2025-08-14T21:20:15.6348297Z * [new tag] v1.8.0-rc3 -> v1.8.0-rc3 2025-08-14T21:20:15.6349445Z * [new tag] v1.8.0-rc4 -> v1.8.0-rc4 2025-08-14T21:20:15.6350755Z * [new tag] v1.8.0-rc5 -> v1.8.0-rc5 2025-08-14T21:20:15.6351831Z * [new tag] v1.8.1 -> v1.8.1 2025-08-14T21:20:15.6353514Z * [new tag] v1.8.1-rc1 -> v1.8.1-rc1 2025-08-14T21:20:15.6354515Z * [new tag] v1.8.1-rc2 -> v1.8.1-rc2 2025-08-14T21:20:15.6355655Z * [new tag] v1.8.1-rc3 -> v1.8.1-rc3 2025-08-14T21:20:15.6357908Z * [new tag] v1.8.2 -> v1.8.2 2025-08-14T21:20:15.6358881Z * [new tag] v1.8.2-rc1 -> v1.8.2-rc1 2025-08-14T21:20:15.6360648Z * [new tag] v1.9.0 -> v1.9.0 2025-08-14T21:20:15.6361790Z * [new tag] v1.9.0-rc1 -> v1.9.0-rc1 2025-08-14T21:20:15.6363549Z * [new tag] v1.9.0-rc2 -> v1.9.0-rc2 2025-08-14T21:20:15.6365124Z * [new tag] v1.9.0-rc3 -> v1.9.0-rc3 2025-08-14T21:20:15.6366146Z * [new tag] v1.9.0-rc4 -> v1.9.0-rc4 2025-08-14T21:20:15.6367800Z * [new tag] v1.9.1 -> v1.9.1 2025-08-14T21:20:15.6369375Z * [new tag] v1.9.1-rc1 -> v1.9.1-rc1 2025-08-14T21:20:15.6370435Z * [new tag] v1.9.1-rc2 -> v1.9.1-rc2 2025-08-14T21:20:15.6372105Z * [new tag] v2.0.0 -> v2.0.0 2025-08-14T21:20:15.6373239Z * [new tag] v2.0.0-rc1 -> v2.0.0-rc1 2025-08-14T21:20:15.6374941Z * [new tag] v2.0.0-rc2 -> v2.0.0-rc2 2025-08-14T21:20:15.6376320Z * [new tag] v2.0.0-rc3 -> v2.0.0-rc3 2025-08-14T21:20:15.6377893Z * [new tag] v2.0.0-rc4 -> v2.0.0-rc4 2025-08-14T21:20:15.6379203Z * [new tag] v2.0.0-rc5 -> v2.0.0-rc5 2025-08-14T21:20:15.6380284Z * [new tag] v2.0.0-rc6 -> v2.0.0-rc6 2025-08-14T21:20:15.6382018Z * [new tag] v2.0.1 -> v2.0.1 2025-08-14T21:20:15.6383466Z * [new tag] v2.0.1-rc1 -> v2.0.1-rc1 2025-08-14T21:20:15.6384451Z * [new tag] v2.0.1-rc2 -> v2.0.1-rc2 2025-08-14T21:20:15.6386018Z * [new tag] v2.0.1-rc3 -> v2.0.1-rc3 2025-08-14T21:20:15.6386989Z * [new tag] v2.0.1-rc4 -> v2.0.1-rc4 2025-08-14T21:20:15.6389154Z * [new tag] v2.1.0 -> v2.1.0 2025-08-14T21:20:15.6390303Z * [new tag] v2.1.0-rc1 -> v2.1.0-rc1 2025-08-14T21:20:15.6392074Z * [new tag] v2.1.0-rc2 -> v2.1.0-rc2 2025-08-14T21:20:15.6393493Z * [new tag] v2.1.0-rc3 -> v2.1.0-rc3 2025-08-14T21:20:15.6394797Z * [new tag] v2.1.0-rc4 -> v2.1.0-rc4 2025-08-14T21:20:15.6396412Z * [new tag] v2.1.0-rc5 -> v2.1.0-rc5 2025-08-14T21:20:15.6397402Z * [new tag] v2.1.0-rc6 -> v2.1.0-rc6 2025-08-14T21:20:15.6399080Z * [new tag] v2.1.1 -> v2.1.1 2025-08-14T21:20:15.6400528Z * [new tag] v2.1.1-rc1 -> v2.1.1-rc1 2025-08-14T21:20:15.6401671Z * [new tag] v2.1.1-rc2 -> v2.1.1-rc2 2025-08-14T21:20:15.6403425Z * [new tag] v2.1.1-rc3 -> v2.1.1-rc3 2025-08-14T21:20:15.6404980Z * [new tag] v2.1.1-rc4 -> v2.1.1-rc4 2025-08-14T21:20:15.6406532Z * [new tag] v2.1.1-rc5 -> v2.1.1-rc5 2025-08-14T21:20:15.6407485Z * [new tag] v2.1.1-rc6 -> v2.1.1-rc6 2025-08-14T21:20:15.6409107Z * [new tag] v2.1.2 -> v2.1.2 2025-08-14T21:20:15.6410579Z * [new tag] v2.1.2-rc1 -> v2.1.2-rc1 2025-08-14T21:20:15.6412094Z * [new tag] v2.1.2-rc2 -> v2.1.2-rc2 2025-08-14T21:20:15.6413084Z * [new tag] v2.1.2-rc3 -> v2.1.2-rc3 2025-08-14T21:20:15.6414802Z * [new tag] v2.2.0 -> v2.2.0 2025-08-14T21:20:15.6416248Z * [new tag] v2.2.0-rc1 -> v2.2.0-rc1 2025-08-14T21:20:15.6418034Z * [new tag] v2.2.0-rc2 -> v2.2.0-rc2 2025-08-14T21:20:15.6419384Z * [new tag] v2.2.0-rc3 -> v2.2.0-rc3 2025-08-14T21:20:15.6420693Z * [new tag] v2.2.0-rc4 -> v2.2.0-rc4 2025-08-14T21:20:15.6422173Z * [new tag] v2.2.0-rc5 -> v2.2.0-rc5 2025-08-14T21:20:15.6423649Z * [new tag] v2.2.0-rc6 -> v2.2.0-rc6 2025-08-14T21:20:15.6424627Z * [new tag] v2.2.0-rc7 -> v2.2.0-rc7 2025-08-14T21:20:15.6426150Z * [new tag] v2.2.0-rc8 -> v2.2.0-rc8 2025-08-14T21:20:15.6427598Z * [new tag] v2.2.1 -> v2.2.1 2025-08-14T21:20:15.6429071Z * [new tag] v2.2.1-rc1 -> v2.2.1-rc1 2025-08-14T21:20:15.6430086Z * [new tag] v2.2.1-rc2 -> v2.2.1-rc2 2025-08-14T21:20:15.6431526Z * [new tag] v2.2.1-rc3 -> v2.2.1-rc3 2025-08-14T21:20:15.6432502Z * [new tag] v2.2.2 -> v2.2.2 2025-08-14T21:20:15.6434238Z * [new tag] v2.2.2-rc1 -> v2.2.2-rc1 2025-08-14T21:20:15.6435234Z * [new tag] v2.2.2-rc2 -> v2.2.2-rc2 2025-08-14T21:20:15.6436550Z * [new tag] v2.2.2-rc3 -> v2.2.2-rc3 2025-08-14T21:20:15.6438074Z * [new tag] v2.3.0 -> v2.3.0 2025-08-14T21:20:15.6439217Z * [new tag] v2.3.0-rc1 -> v2.3.0-rc1 2025-08-14T21:20:15.6440872Z * [new tag] v2.3.0-rc10 -> v2.3.0-rc10 2025-08-14T21:20:15.6442357Z * [new tag] v2.3.0-rc11 -> v2.3.0-rc11 2025-08-14T21:20:15.6443334Z * [new tag] v2.3.0-rc12 -> v2.3.0-rc12 2025-08-14T21:20:15.6445028Z * [new tag] v2.3.0-rc2 -> v2.3.0-rc2 2025-08-14T21:20:15.6446535Z * [new tag] v2.3.0-rc3 -> v2.3.0-rc3 2025-08-14T21:20:15.6447813Z * [new tag] v2.3.0-rc4 -> v2.3.0-rc4 2025-08-14T21:20:15.6449686Z * [new tag] v2.3.0-rc5 -> v2.3.0-rc5 2025-08-14T21:20:15.6450597Z * [new tag] v2.3.0-rc6 -> v2.3.0-rc6 2025-08-14T21:20:15.6452250Z * [new tag] v2.3.0-rc7 -> v2.3.0-rc7 2025-08-14T21:20:15.6453559Z * [new tag] v2.3.0-rc8 -> v2.3.0-rc8 2025-08-14T21:20:15.6454623Z * [new tag] v2.3.0-rc9 -> v2.3.0-rc9 2025-08-14T21:20:15.6455973Z * [new tag] v2.3.1 -> v2.3.1 2025-08-14T21:20:15.6457507Z * [new tag] v2.3.1-rc1 -> v2.3.1-rc1 2025-08-14T21:20:15.6458850Z * [new tag] v2.3.1-rc2 -> v2.3.1-rc2 2025-08-14T21:20:15.6460341Z * [new tag] v2.3.1-rc3 -> v2.3.1-rc3 2025-08-14T21:20:15.6461792Z * [new tag] v2.4.0 -> v2.4.0 2025-08-14T21:20:15.6462956Z * [new tag] v2.4.0-rc1 -> v2.4.0-rc1 2025-08-14T21:20:15.6464589Z * [new tag] v2.4.0-rc2 -> v2.4.0-rc2 2025-08-14T21:20:15.6465748Z * [new tag] v2.4.0-rc3 -> v2.4.0-rc3 2025-08-14T21:20:15.6467371Z * [new tag] v2.4.0-rc4 -> v2.4.0-rc4 2025-08-14T21:20:15.6468843Z * [new tag] v2.4.0-rc5 -> v2.4.0-rc5 2025-08-14T21:20:15.6470292Z * [new tag] v2.4.0-rc6 -> v2.4.0-rc6 2025-08-14T21:20:15.6471769Z * [new tag] v2.4.0-rc7 -> v2.4.0-rc7 2025-08-14T21:20:15.6472907Z * [new tag] v2.4.0-rc8 -> v2.4.0-rc8 2025-08-14T21:20:15.6474668Z * [new tag] v2.4.0-rc9 -> v2.4.0-rc9 2025-08-14T21:20:15.6475652Z * [new tag] v2.4.1 -> v2.4.1 2025-08-14T21:20:15.6477404Z * [new tag] v2.4.1-rc1 -> v2.4.1-rc1 2025-08-14T21:20:15.6478827Z * [new tag] v2.4.1-rc2 -> v2.4.1-rc2 2025-08-14T21:20:15.6480298Z * [new tag] v2.4.1-rc3 -> v2.4.1-rc3 2025-08-14T21:20:15.6482160Z * [new tag] v2.5.0 -> v2.5.0 2025-08-14T21:20:15.6482871Z * [new tag] v2.5.0-rc1 -> v2.5.0-rc1 2025-08-14T21:20:15.6484194Z * [new tag] v2.5.0-rc10 -> v2.5.0-rc10 2025-08-14T21:20:15.6485998Z * [new tag] v2.5.0-rc2 -> v2.5.0-rc2 2025-08-14T21:20:15.6487132Z * [new tag] v2.5.0-rc3 -> v2.5.0-rc3 2025-08-14T21:20:15.6488752Z * [new tag] v2.5.0-rc4 -> v2.5.0-rc4 2025-08-14T21:20:15.6490318Z * [new tag] v2.5.0-rc5 -> v2.5.0-rc5 2025-08-14T21:20:15.6491638Z * [new tag] v2.5.0-rc6 -> v2.5.0-rc6 2025-08-14T21:20:15.6493323Z * [new tag] v2.5.0-rc7 -> v2.5.0-rc7 2025-08-14T21:20:15.6494463Z * [new tag] v2.5.0-rc8 -> v2.5.0-rc8 2025-08-14T21:20:15.6496231Z * [new tag] v2.5.0-rc9 -> v2.5.0-rc9 2025-08-14T21:20:15.6497272Z * [new tag] v2.5.1 -> v2.5.1 2025-08-14T21:20:15.6498910Z * [new tag] v2.5.1-rc1 -> v2.5.1-rc1 2025-08-14T21:20:15.6499665Z * [new tag] v2.6.0 -> v2.6.0 2025-08-14T21:20:15.6501376Z * [new tag] v2.6.0-rc1 -> v2.6.0-rc1 2025-08-14T21:20:15.6502923Z * [new tag] v2.6.0-rc2 -> v2.6.0-rc2 2025-08-14T21:20:15.6504391Z * [new tag] v2.6.0-rc3 -> v2.6.0-rc3 2025-08-14T21:20:15.6505489Z * [new tag] v2.6.0-rc4 -> v2.6.0-rc4 2025-08-14T21:20:15.6508228Z * [new tag] v2.6.0-rc5 -> v2.6.0-rc5 2025-08-14T21:20:15.6509572Z * [new tag] v2.6.0-rc6 -> v2.6.0-rc6 2025-08-14T21:20:15.6511120Z * [new tag] v2.6.0-rc7 -> v2.6.0-rc7 2025-08-14T21:20:15.6512622Z * [new tag] v2.6.0-rc8 -> v2.6.0-rc8 2025-08-14T21:20:15.6514108Z * [new tag] v2.6.0-rc9 -> v2.6.0-rc9 2025-08-14T21:20:15.6515673Z * [new tag] v2.7.0 -> v2.7.0 2025-08-14T21:20:15.6517156Z * [new tag] v2.7.0-rc1 -> v2.7.0-rc1 2025-08-14T21:20:15.6518122Z * [new tag] v2.7.0-rc10 -> v2.7.0-rc10 2025-08-14T21:20:15.6519831Z * [new tag] v2.7.0-rc2 -> v2.7.0-rc2 2025-08-14T21:20:15.6521391Z * [new tag] v2.7.0-rc3 -> v2.7.0-rc3 2025-08-14T21:20:15.6522858Z * [new tag] v2.7.0-rc4 -> v2.7.0-rc4 2025-08-14T21:20:15.6523997Z * [new tag] v2.7.0-rc5 -> v2.7.0-rc5 2025-08-14T21:20:15.6525867Z * [new tag] v2.7.0-rc6 -> v2.7.0-rc6 2025-08-14T21:20:15.6527192Z * [new tag] v2.7.0-rc7 -> v2.7.0-rc7 2025-08-14T21:20:15.6528905Z * [new tag] v2.7.0-rc8 -> v2.7.0-rc8 2025-08-14T21:20:15.6530363Z * [new tag] v2.7.0-rc9 -> v2.7.0-rc9 2025-08-14T21:20:15.6531360Z * [new tag] v2.7.1 -> v2.7.1 2025-08-14T21:20:15.6533139Z * [new tag] v2.7.1-rc1 -> v2.7.1-rc1 2025-08-14T21:20:15.6534611Z * [new tag] v2.7.1-rc2 -> v2.7.1-rc2 2025-08-14T21:20:15.6536084Z * [new tag] v2.7.1-rc3 -> v2.7.1-rc3 2025-08-14T21:20:15.6537648Z * [new tag] v2.7.1-rc4 -> v2.7.1-rc4 2025-08-14T21:20:15.6539160Z * [new tag] v2.7.1-rc5 -> v2.7.1-rc5 2025-08-14T21:20:15.6540228Z * [new tag] v2.8.0 -> v2.8.0 2025-08-14T21:20:15.6541868Z * [new tag] v2.8.0-rc1 -> v2.8.0-rc1 2025-08-14T21:20:15.6543403Z * [new tag] v2.8.0-rc2 -> v2.8.0-rc2 2025-08-14T21:20:15.6544936Z * [new tag] v2.8.0-rc3 -> v2.8.0-rc3 2025-08-14T21:20:15.6546483Z * [new tag] v2.8.0-rc4 -> v2.8.0-rc4 2025-08-14T21:20:15.6547759Z * [new tag] v2.8.0-rc5 -> v2.8.0-rc5 2025-08-14T21:20:15.6552059Z * [new tag] v2.8.0-rc6 -> v2.8.0-rc6 2025-08-14T21:20:15.6553375Z * [new tag] v2.8.0-rc7 -> v2.8.0-rc7 2025-08-14T21:20:15.6555052Z * [new tag] v2.8.0-rc8 -> v2.8.0-rc8 2025-08-14T21:20:15.6556178Z * [new tag] whc_flight_1 -> whc_flight_1 2025-08-14T21:20:15.6557521Z * [new tag] whc_flight_2 -> whc_flight_2 2025-08-14T21:20:15.6559051Z * [new tag] whc_flight_4 -> whc_flight_4 2025-08-14T21:20:15.7633778Z [command]/usr/bin/git rev-parse --verify --quiet 1fc683cf17c8c673044538d10266c00f92987be2^{object} 2025-08-14T21:20:15.7666918Z 1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:20:15.7673028Z ##[endgroup] 2025-08-14T21:20:15.7673432Z ##[group]Determining the checkout info 2025-08-14T21:20:15.7674654Z ##[endgroup] 2025-08-14T21:20:15.7679671Z [command]/usr/bin/git sparse-checkout disable 2025-08-14T21:20:15.7737686Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-08-14T21:20:15.7772008Z ##[group]Checking out the ref 2025-08-14T21:20:15.7775717Z [command]/usr/bin/git checkout --progress --force 1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:20:16.8129682Z Updating files: 71% (13828/19474) 2025-08-14T21:20:16.8199484Z Updating files: 72% (14022/19474) 2025-08-14T21:20:16.8385661Z Updating files: 73% (14217/19474) 2025-08-14T21:20:16.8687035Z Updating files: 74% (14411/19474) 2025-08-14T21:20:16.9073183Z Updating files: 75% (14606/19474) 2025-08-14T21:20:16.9360717Z Updating files: 76% (14801/19474) 2025-08-14T21:20:16.9520374Z Updating files: 77% (14995/19474) 2025-08-14T21:20:16.9679002Z Updating files: 78% (15190/19474) 2025-08-14T21:20:16.9985418Z Updating files: 79% (15385/19474) 2025-08-14T21:20:17.0298024Z Updating files: 80% (15580/19474) 2025-08-14T21:20:17.0618103Z Updating files: 81% (15774/19474) 2025-08-14T21:20:17.0862943Z Updating files: 82% (15969/19474) 2025-08-14T21:20:17.1046452Z Updating files: 83% (16164/19474) 2025-08-14T21:20:17.1208588Z Updating files: 84% (16359/19474) 2025-08-14T21:20:17.1398563Z Updating files: 85% (16553/19474) 2025-08-14T21:20:17.1578692Z Updating files: 86% (16748/19474) 2025-08-14T21:20:17.1744168Z Updating files: 87% (16943/19474) 2025-08-14T21:20:17.1897165Z Updating files: 88% (17138/19474) 2025-08-14T21:20:17.2056163Z Updating files: 89% (17332/19474) 2025-08-14T21:20:17.2257385Z Updating files: 90% (17527/19474) 2025-08-14T21:20:17.2424755Z Updating files: 91% (17722/19474) 2025-08-14T21:20:17.2587988Z Updating files: 92% (17917/19474) 2025-08-14T21:20:17.2809865Z Updating files: 93% (18111/19474) 2025-08-14T21:20:17.3031659Z Updating files: 94% (18306/19474) 2025-08-14T21:20:17.3251967Z Updating files: 95% (18501/19474) 2025-08-14T21:20:17.3449869Z Updating files: 96% (18696/19474) 2025-08-14T21:20:17.3654978Z Updating files: 97% (18890/19474) 2025-08-14T21:20:17.3968630Z Updating files: 98% (19085/19474) 2025-08-14T21:20:17.4158913Z Updating files: 99% (19280/19474) 2025-08-14T21:20:17.4159271Z Updating files: 100% (19474/19474) 2025-08-14T21:20:17.4159566Z Updating files: 100% (19474/19474), done. 2025-08-14T21:20:17.4436013Z Note: switching to '1fc683cf17c8c673044538d10266c00f92987be2'. 2025-08-14T21:20:17.4436840Z 2025-08-14T21:20:17.4437249Z You are in 'detached HEAD' state. You can look around, make experimental 2025-08-14T21:20:17.4438243Z changes and commit them, and you can discard any commits you make in this 2025-08-14T21:20:17.4439241Z state without impacting any branches by switching back to a branch. 2025-08-14T21:20:17.4439822Z 2025-08-14T21:20:17.4440210Z If you want to create a new branch to retain commits you create, you may 2025-08-14T21:20:17.4441132Z do so (now or later) by using -c with the switch command. Example: 2025-08-14T21:20:17.4441656Z 2025-08-14T21:20:17.4441892Z git switch -c 2025-08-14T21:20:17.4442264Z 2025-08-14T21:20:17.4442482Z Or undo this operation with: 2025-08-14T21:20:17.4442820Z 2025-08-14T21:20:17.4442998Z git switch - 2025-08-14T21:20:17.4443262Z 2025-08-14T21:20:17.4443710Z Turn off this advice by setting config variable advice.detachedHead to false 2025-08-14T21:20:17.4444836Z 2025-08-14T21:20:17.4445572Z HEAD is now at 1fc683cf17c [Inductor] Allow indexing a flexible layout for extract_input_node_reduction_ranges (#160645) 2025-08-14T21:20:17.4609598Z ##[endgroup] 2025-08-14T21:20:17.4610015Z ##[group]Setting up auth for fetching submodules 2025-08-14T21:20:17.4615947Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-08-14T21:20:17.4677032Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-08-14T21:20:17.4712262Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-08-14T21:20:17.4747789Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-08-14T21:20:17.4779724Z ##[endgroup] 2025-08-14T21:20:17.4780104Z ##[group]Fetching submodules 2025-08-14T21:20:17.4783183Z [command]/usr/bin/git submodule sync --recursive 2025-08-14T21:20:17.5206155Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-08-14T21:20:17.5968513Z Submodule 'android/libs/fbjni' (https://github.com/facebookincubator/fbjni.git) registered for path 'android/libs/fbjni' 2025-08-14T21:20:17.5971773Z Submodule 'third_party/NNPACK_deps/FP16' (https://github.com/Maratyszcza/FP16.git) registered for path 'third_party/FP16' 2025-08-14T21:20:17.5976177Z Submodule 'third_party/NNPACK_deps/FXdiv' (https://github.com/Maratyszcza/FXdiv.git) registered for path 'third_party/FXdiv' 2025-08-14T21:20:17.5980673Z Submodule 'third_party/NNPACK' (https://github.com/Maratyszcza/NNPACK.git) registered for path 'third_party/NNPACK' 2025-08-14T21:20:17.5985345Z Submodule 'third_party/NVTX' (https://github.com/NVIDIA/NVTX.git) registered for path 'third_party/NVTX' 2025-08-14T21:20:17.6002157Z Submodule 'third_party/VulkanMemoryAllocator' (https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.git) registered for path 'third_party/VulkanMemoryAllocator' 2025-08-14T21:20:17.6006907Z Submodule 'third_party/XNNPACK' (https://github.com/google/XNNPACK.git) registered for path 'third_party/XNNPACK' 2025-08-14T21:20:17.6011694Z Submodule 'third_party/aiter' (https://github.com/ROCm/aiter.git) registered for path 'third_party/aiter' 2025-08-14T21:20:17.6016784Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/benchmark' 2025-08-14T21:20:17.6033197Z Submodule 'third_party/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/composable_kernel' 2025-08-14T21:20:17.6038298Z Submodule 'third_party/cpp-httplib' (https://github.com/yhirose/cpp-httplib.git) registered for path 'third_party/cpp-httplib' 2025-08-14T21:20:17.6043882Z Submodule 'third_party/cpuinfo' (https://github.com/pytorch/cpuinfo.git) registered for path 'third_party/cpuinfo' 2025-08-14T21:20:17.6050001Z Submodule 'third_party/cudnn_frontend' (https://github.com/NVIDIA/cudnn-frontend.git) registered for path 'third_party/cudnn_frontend' 2025-08-14T21:20:17.6055416Z Submodule 'third_party/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/cutlass' 2025-08-14T21:20:17.6071765Z Submodule 'third_party/fbgemm' (https://github.com/pytorch/fbgemm) registered for path 'third_party/fbgemm' 2025-08-14T21:20:17.6078196Z Submodule 'third_party/flash-attention' (https://github.com/Dao-AILab/flash-attention.git) registered for path 'third_party/flash-attention' 2025-08-14T21:20:17.6083757Z Submodule 'third_party/flatbuffers' (https://github.com/google/flatbuffers.git) registered for path 'third_party/flatbuffers' 2025-08-14T21:20:17.6093082Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/fmt' 2025-08-14T21:20:17.6110046Z Submodule 'third_party/gemmlowp/gemmlowp' (https://github.com/google/gemmlowp.git) registered for path 'third_party/gemmlowp/gemmlowp' 2025-08-14T21:20:17.6115736Z Submodule 'third_party/gloo' (https://github.com/pytorch/gloo) registered for path 'third_party/gloo' 2025-08-14T21:20:17.6121682Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/googletest' 2025-08-14T21:20:17.6127814Z Submodule 'third_party/ideep' (https://github.com/intel/ideep) registered for path 'third_party/ideep' 2025-08-14T21:20:17.6133890Z Submodule 'third_party/ittapi' (https://github.com/intel/ittapi.git) registered for path 'third_party/ittapi' 2025-08-14T21:20:17.6151634Z Submodule 'third_party/kineto' (https://github.com/pytorch/kineto) registered for path 'third_party/kineto' 2025-08-14T21:20:17.6159031Z Submodule 'third_party/kleidiai' (https://github.com/ARM-software/kleidiai.git) registered for path 'third_party/kleidiai' 2025-08-14T21:20:17.6165236Z Submodule 'third_party/mimalloc' (https://github.com/microsoft/mimalloc.git) registered for path 'third_party/mimalloc' 2025-08-14T21:20:17.6171811Z Submodule 'third_party/nlohmann' (https://github.com/nlohmann/json.git) registered for path 'third_party/nlohmann' 2025-08-14T21:20:17.6189290Z Submodule 'third_party/onnx' (https://github.com/onnx/onnx.git) registered for path 'third_party/onnx' 2025-08-14T21:20:17.6195997Z Submodule 'third_party/opentelemetry-cpp' (https://github.com/open-telemetry/opentelemetry-cpp.git) registered for path 'third_party/opentelemetry-cpp' 2025-08-14T21:20:17.6202225Z Submodule 'third_party/pocketfft' (https://github.com/mreineck/pocketfft) registered for path 'third_party/pocketfft' 2025-08-14T21:20:17.6209066Z Submodule 'third_party/protobuf' (https://github.com/protocolbuffers/protobuf.git) registered for path 'third_party/protobuf' 2025-08-14T21:20:17.6215886Z Submodule 'third_party/NNPACK_deps/psimd' (https://github.com/Maratyszcza/psimd.git) registered for path 'third_party/psimd' 2025-08-14T21:20:17.6234134Z Submodule 'third_party/NNPACK_deps/pthreadpool' (https://github.com/Maratyszcza/pthreadpool.git) registered for path 'third_party/pthreadpool' 2025-08-14T21:20:17.6240927Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/pybind11' 2025-08-14T21:20:17.6252200Z Submodule 'third_party/python-peachpy' (https://github.com/malfet/PeachPy.git) registered for path 'third_party/python-peachpy' 2025-08-14T21:20:17.6259176Z Submodule 'third_party/sleef' (https://github.com/shibatch/sleef) registered for path 'third_party/sleef' 2025-08-14T21:20:17.6276680Z Submodule 'third_party/tensorpipe' (https://github.com/pytorch/tensorpipe.git) registered for path 'third_party/tensorpipe' 2025-08-14T21:20:17.6317863Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/android/libs/fbjni'... 2025-08-14T21:20:17.8901248Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FXdiv'... 2025-08-14T21:20:17.8902078Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/FP16'... 2025-08-14T21:20:17.8944719Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fmt'... 2025-08-14T21:20:20.8361738Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NNPACK'... 2025-08-14T21:20:20.8363240Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/benchmark'... 2025-08-14T21:20:20.8365428Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/NVTX'... 2025-08-14T21:20:20.8366728Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gloo'... 2025-08-14T21:20:20.8367985Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention'... 2025-08-14T21:20:20.8369426Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpp-httplib'... 2025-08-14T21:20:20.8371304Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/gemmlowp/gemmlowp'... 2025-08-14T21:20:20.8372461Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cpuinfo'... 2025-08-14T21:20:20.8373574Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep'... 2025-08-14T21:20:20.8374982Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ittapi'... 2025-08-14T21:20:20.8376042Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kleidiai'... 2025-08-14T21:20:20.8377095Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pocketfft'... 2025-08-14T21:20:20.8378282Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cudnn_frontend'... 2025-08-14T21:20:20.8379363Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/psimd'... 2025-08-14T21:20:20.8380477Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/mimalloc'... 2025-08-14T21:20:20.8381597Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/googletest'... 2025-08-14T21:20:20.8382775Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pthreadpool'... 2025-08-14T21:20:20.8384250Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/VulkanMemoryAllocator'... 2025-08-14T21:20:20.9096577Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flatbuffers'... 2025-08-14T21:20:21.0098606Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp'... 2025-08-14T21:20:32.0909993Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/python-peachpy'... 2025-08-14T21:20:32.0911260Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe'... 2025-08-14T21:20:32.0912391Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto'... 2025-08-14T21:20:32.0913483Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/sleef'... 2025-08-14T21:20:32.0914581Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/cutlass'... 2025-08-14T21:20:32.0915706Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/pybind11'... 2025-08-14T21:20:32.0916859Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm'... 2025-08-14T21:20:32.0918038Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/composable_kernel'... 2025-08-14T21:20:32.0919177Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx'... 2025-08-14T21:20:32.0920200Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/aiter'... 2025-08-14T21:20:32.0921305Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/nlohmann'... 2025-08-14T21:20:32.1911046Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/XNNPACK'... 2025-08-14T21:20:37.8470708Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf'... 2025-08-14T21:20:37.8686211Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-08-14T21:20:37.8860134Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-08-14T21:20:37.9000660Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-08-14T21:20:37.9349458Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-08-14T21:20:38.0323505Z Submodule path 'third_party/NVTX': checked out '2942f167cc30c5e3a44a2aecd5b0d9c07ff61a07' 2025-08-14T21:20:38.0983333Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-08-14T21:20:39.0734199Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-08-14T21:20:39.2598303Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-08-14T21:20:39.2628686Z Submodule '3rdparty/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/aiter/3rdparty/composable_kernel' 2025-08-14T21:20:39.2663134Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/aiter/3rdparty/composable_kernel'... 2025-08-14T21:20:42.7189385Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-08-14T21:20:42.7507524Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-08-14T21:20:43.1813809Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-08-14T21:20:43.2479162Z Submodule path 'third_party/cpp-httplib': checked out '3af7f2c16147f3fbc6e4d717032daf505dc1652c' 2025-08-14T21:20:43.3656690Z Submodule path 'third_party/cpuinfo': checked out '5e3d2445e6a84d9599bee2bf78edbb4d80865e1d' 2025-08-14T21:20:43.4201222Z Submodule path 'third_party/cudnn_frontend': checked out 'f937055efc6d414d11f4c6577e3977fe74f35fb6' 2025-08-14T21:20:44.2172643Z Submodule path 'third_party/cutlass': checked out 'e51efbfe18fe4f4cbb66ab814c55bf4aa0185491' 2025-08-14T21:20:44.3989973Z Submodule path 'third_party/fbgemm': checked out '21c7d30c526c0f1ad873ecc632dca6cfa8a69067' 2025-08-14T21:20:44.4022227Z Submodule 'external/asmjit' (https://github.com/asmjit/asmjit.git) registered for path 'third_party/fbgemm/external/asmjit' 2025-08-14T21:20:44.4025229Z Submodule 'external/composable_kernel' (https://github.com/jwfromm/composable_kernel.git) registered for path 'third_party/fbgemm/external/composable_kernel' 2025-08-14T21:20:44.4029078Z Submodule 'external/cpuinfo' (https://github.com/pytorch/cpuinfo) registered for path 'third_party/fbgemm/external/cpuinfo' 2025-08-14T21:20:44.4033134Z Submodule 'external/cutlass' (https://github.com/jwfromm/cutlass) registered for path 'third_party/fbgemm/external/cutlass' 2025-08-14T21:20:44.4037331Z Submodule 'external/googletest' (https://github.com/google/googletest) registered for path 'third_party/fbgemm/external/googletest' 2025-08-14T21:20:44.4041600Z Submodule 'external/hipify_torch' (https://github.com/ROCmSoftwarePlatform/hipify_torch.git) registered for path 'third_party/fbgemm/external/hipify_torch' 2025-08-14T21:20:44.4045778Z Submodule 'external/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/fbgemm/external/json' 2025-08-14T21:20:44.4084964Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/asmjit'... 2025-08-14T21:20:45.7560791Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/hipify_torch'... 2025-08-14T21:20:45.7562031Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/cpuinfo'... 2025-08-14T21:20:45.7563164Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/googletest'... 2025-08-14T21:20:45.8562062Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/composable_kernel'... 2025-08-14T21:20:46.0268754Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/cutlass'... 2025-08-14T21:20:46.9877738Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/fbgemm/external/json'... 2025-08-14T21:20:51.9979405Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-08-14T21:20:52.3577462Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out 'b1281b8b08d973a7064f864f47eeb30f3e2596e9' 2025-08-14T21:20:52.4782012Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-08-14T21:20:53.2590576Z Submodule path 'third_party/fbgemm/external/cutlass': checked out 'b40777404c174b9694a870bff5c13ce6b7f656ad' 2025-08-14T21:20:53.3160108Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-08-14T21:20:53.3325746Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out 'a4337c69fe0e2552a7b7b0669178926beeed828c' 2025-08-14T21:20:53.4665719Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-08-14T21:20:53.5610856Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-08-14T21:20:53.5640365Z Submodule 'csrc/composable_kernel' (https://github.com/ROCm/composable_kernel.git) registered for path 'third_party/flash-attention/csrc/composable_kernel' 2025-08-14T21:20:53.5642980Z Submodule 'csrc/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'third_party/flash-attention/csrc/cutlass' 2025-08-14T21:20:53.5679557Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention/csrc/composable_kernel'... 2025-08-14T21:20:56.6772435Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/flash-attention/csrc/cutlass'... 2025-08-14T21:20:57.0066648Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-08-14T21:20:57.7292732Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-08-14T21:20:57.9154414Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-08-14T21:20:57.9570848Z Submodule path 'third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-08-14T21:20:58.0054508Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-08-14T21:20:58.0404654Z Submodule path 'third_party/gloo': checked out 'c7b7b022c124d9643957d9bd55f57ac59fce8fa2' 2025-08-14T21:20:58.0962232Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-08-14T21:20:58.1146909Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-08-14T21:20:58.1172542Z Submodule 'mkl-dnn' (https://github.com/intel/mkl-dnn.git) registered for path 'third_party/ideep/mkl-dnn' 2025-08-14T21:20:58.1207785Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/ideep/mkl-dnn'... 2025-08-14T21:21:10.4728208Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-08-14T21:21:10.5004595Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-08-14T21:21:10.6048634Z Submodule path 'third_party/kineto': checked out '5e7501833f1021ce6f618572d3baf657b6319658' 2025-08-14T21:21:10.6075646Z Submodule 'libkineto/third_party/dynolog' (https://github.com/facebookincubator/dynolog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog' 2025-08-14T21:21:10.6078332Z Submodule 'libkineto/third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/fmt' 2025-08-14T21:21:10.6082420Z Submodule 'libkineto/third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/googletest' 2025-08-14T21:21:10.6118612Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog'... 2025-08-14T21:21:11.4034624Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/fmt'... 2025-08-14T21:21:11.9772007Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/googletest'... 2025-08-14T21:21:12.0755109Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out '7d04a0053a845370ae06ce317a22a48e9edcc74e' 2025-08-14T21:21:12.0778900Z Submodule 'third_party/DCGM' (https://github.com/NVIDIA/DCGM.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-08-14T21:21:12.0782629Z Submodule 'third_party/cpr' (https://github.com/libcpr/cpr.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-08-14T21:21:12.0786708Z Submodule 'third_party/fmt' (https://github.com/fmtlib/fmt.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-08-14T21:21:12.0790828Z Submodule 'third_party/gflags' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-08-14T21:21:12.0794955Z Submodule 'third_party/glog' (https://github.com/google/glog.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-08-14T21:21:12.0799380Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-08-14T21:21:12.0804330Z Submodule 'third_party/json' (https://github.com/nlohmann/json.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-08-14T21:21:12.0808995Z Submodule 'third_party/pfs' (https://github.com/dtrugman/pfs.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-08-14T21:21:12.0845403Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM'... 2025-08-14T21:21:13.4925547Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/pfs'... 2025-08-14T21:21:13.4927887Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags'... 2025-08-14T21:21:13.4929473Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/cpr'... 2025-08-14T21:21:13.4931091Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/glog'... 2025-08-14T21:21:13.4932263Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/googletest'... 2025-08-14T21:21:13.5926873Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/fmt'... 2025-08-14T21:21:13.6941640Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/json'... 2025-08-14T21:21:19.8445878Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-08-14T21:21:19.8707453Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-08-14T21:21:19.9172128Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-08-14T21:21:19.9364100Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-08-14T21:21:19.9385513Z Submodule 'doc' (https://github.com/gflags/gflags.git) registered for path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-08-14T21:21:19.9426425Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc'... 2025-08-14T21:21:20.2288950Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-08-14T21:21:20.2546017Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-08-14T21:21:20.3061192Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '58d77fa8070e8cec2dc1ed015d66b454c8d78850' 2025-08-14T21:21:20.4341780Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-08-14T21:21:20.4569980Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-08-14T21:21:20.5047486Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '0041a40c1350ba702d475b9c4ad62da77caea164' 2025-08-14T21:21:20.5754733Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '7aca84427f224eeed3144123d5230d5871e93347' 2025-08-14T21:21:20.6329648Z Submodule path 'third_party/kleidiai': checked out 'cca02c2f69dd18e1f12647c1c0bdc8cf90e680c7' 2025-08-14T21:21:20.6846765Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-08-14T21:21:20.8281248Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-08-14T21:21:21.4435364Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-08-14T21:21:21.4481527Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/onnx/third_party/pybind11' 2025-08-14T21:21:21.4519769Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/onnx/third_party/pybind11'... 2025-08-14T21:21:22.5385158Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-08-14T21:21:22.6375888Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-08-14T21:21:22.6403883Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark) registered for path 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-08-14T21:21:22.6407290Z Submodule 'third_party/googletest' (https://github.com/google/googletest) registered for path 'third_party/opentelemetry-cpp/third_party/googletest' 2025-08-14T21:21:22.6411052Z Submodule 'third_party/ms-gsl' (https://github.com/microsoft/GSL) registered for path 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-08-14T21:21:22.6415360Z Submodule 'third_party/nlohmann-json' (https://github.com/nlohmann/json) registered for path 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-08-14T21:21:22.6419872Z Submodule 'third_party/opentelemetry-proto' (https://github.com/open-telemetry/opentelemetry-proto) registered for path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-08-14T21:21:22.6424989Z Submodule 'third_party/opentracing-cpp' (https://github.com/opentracing/opentracing-cpp.git) registered for path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-08-14T21:21:22.6428863Z Submodule 'third_party/prometheus-cpp' (https://github.com/jupp0r/prometheus-cpp) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-08-14T21:21:22.6433497Z Submodule 'tools/vcpkg' (https://github.com/Microsoft/vcpkg) registered for path 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-08-14T21:21:22.6473843Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/benchmark'... 2025-08-14T21:21:23.0878380Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentracing-cpp'... 2025-08-14T21:21:23.0879423Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/opentelemetry-proto'... 2025-08-14T21:21:23.0880484Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/ms-gsl'... 2025-08-14T21:21:23.0881700Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp'... 2025-08-14T21:21:23.1879668Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/googletest'... 2025-08-14T21:21:23.8370211Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/nlohmann-json'... 2025-08-14T21:21:31.8949260Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/tools/vcpkg'... 2025-08-14T21:21:32.1102536Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-08-14T21:21:32.1609515Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-08-14T21:21:32.1830098Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-08-14T21:21:32.3181655Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-08-14T21:21:32.3373444Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-08-14T21:21:32.3584788Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-08-14T21:21:32.3813230Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-08-14T21:21:32.3837126Z Submodule 'civetweb' (https://github.com/civetweb/civetweb.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-08-14T21:21:32.3839900Z Submodule 'googletest' (https://github.com/google/googletest.git) registered for path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-08-14T21:21:32.3875064Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'... 2025-08-14T21:21:34.2900058Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'... 2025-08-14T21:21:34.5810454Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-08-14T21:21:34.6388833Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-08-14T21:21:35.3605311Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-08-14T21:21:35.3771400Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-08-14T21:21:35.7174776Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-08-14T21:21:35.7207038Z Submodule 'third_party/benchmark' (https://github.com/google/benchmark.git) registered for path 'third_party/protobuf/third_party/benchmark' 2025-08-14T21:21:35.7209177Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/protobuf/third_party/googletest' 2025-08-14T21:21:35.7245512Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/benchmark'... 2025-08-14T21:21:36.2873231Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/protobuf/third_party/googletest'... 2025-08-14T21:21:36.7885278Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-08-14T21:21:36.8751659Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-08-14T21:21:36.8892699Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-08-14T21:21:36.9065981Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-08-14T21:21:36.9545324Z Submodule path 'third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-08-14T21:21:36.9917846Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-08-14T21:21:37.0456067Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-08-14T21:21:37.0831161Z Submodule path 'third_party/tensorpipe': checked out 'dacda0567d9f23d4bc503e1c4f84aa65f33ac38a' 2025-08-14T21:21:37.0861883Z Submodule 'third_party/googletest' (https://github.com/google/googletest.git) registered for path 'third_party/tensorpipe/third_party/googletest' 2025-08-14T21:21:37.0864744Z Submodule 'third_party/libnop' (https://github.com/google/libnop.git) registered for path 'third_party/tensorpipe/third_party/libnop' 2025-08-14T21:21:37.0868614Z Submodule 'third_party/libuv' (https://github.com/libuv/libuv.git) registered for path 'third_party/tensorpipe/third_party/libuv' 2025-08-14T21:21:37.0872751Z Submodule 'third_party/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'third_party/tensorpipe/third_party/pybind11' 2025-08-14T21:21:37.0908091Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/googletest'... 2025-08-14T21:21:38.2348179Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libnop'... 2025-08-14T21:21:38.2349043Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11'... 2025-08-14T21:21:38.3349901Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/libuv'... 2025-08-14T21:21:38.4956936Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-08-14T21:21:38.5177985Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-08-14T21:21:38.6083720Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-08-14T21:21:38.6463174Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-08-14T21:21:38.6485301Z Submodule 'tools/clang' (https://github.com/wjakob/clang-cindex-python3) registered for path 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-08-14T21:21:38.6523050Z Cloning into '/home/ec2-user/actions-runner/_work/pytorch/pytorch/third_party/tensorpipe/third_party/pybind11/tools/clang'... 2025-08-14T21:21:38.8601256Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-08-14T21:21:38.8651112Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-08-14T21:21:38.9058288Z Entering 'android/libs/fbjni' 2025-08-14T21:21:38.9121947Z Entering 'third_party/FP16' 2025-08-14T21:21:38.9182455Z Entering 'third_party/FXdiv' 2025-08-14T21:21:38.9241741Z Entering 'third_party/NNPACK' 2025-08-14T21:21:38.9302680Z Entering 'third_party/NVTX' 2025-08-14T21:21:38.9363937Z Entering 'third_party/VulkanMemoryAllocator' 2025-08-14T21:21:38.9423784Z Entering 'third_party/XNNPACK' 2025-08-14T21:21:38.9499288Z Entering 'third_party/aiter' 2025-08-14T21:21:38.9558552Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-08-14T21:21:38.9628480Z Entering 'third_party/benchmark' 2025-08-14T21:21:38.9688458Z Entering 'third_party/composable_kernel' 2025-08-14T21:21:38.9757695Z Entering 'third_party/cpp-httplib' 2025-08-14T21:21:38.9816856Z Entering 'third_party/cpuinfo' 2025-08-14T21:21:38.9876789Z Entering 'third_party/cudnn_frontend' 2025-08-14T21:21:38.9936592Z Entering 'third_party/cutlass' 2025-08-14T21:21:39.0008367Z Entering 'third_party/fbgemm' 2025-08-14T21:21:39.0073557Z Entering 'third_party/fbgemm/external/asmjit' 2025-08-14T21:21:39.0130292Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-08-14T21:21:39.0196029Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-08-14T21:21:39.0252803Z Entering 'third_party/fbgemm/external/cutlass' 2025-08-14T21:21:39.0319768Z Entering 'third_party/fbgemm/external/googletest' 2025-08-14T21:21:39.0377876Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-08-14T21:21:39.0434582Z Entering 'third_party/fbgemm/external/json' 2025-08-14T21:21:39.0498010Z Entering 'third_party/flash-attention' 2025-08-14T21:21:39.0558519Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-08-14T21:21:39.0623824Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-08-14T21:21:39.0693571Z Entering 'third_party/flatbuffers' 2025-08-14T21:21:39.0757446Z Entering 'third_party/fmt' 2025-08-14T21:21:39.0817388Z Entering 'third_party/gemmlowp/gemmlowp' 2025-08-14T21:21:39.0877924Z Entering 'third_party/gloo' 2025-08-14T21:21:39.0942654Z Entering 'third_party/googletest' 2025-08-14T21:21:39.1001954Z Entering 'third_party/ideep' 2025-08-14T21:21:39.1060379Z Entering 'third_party/ideep/mkl-dnn' 2025-08-14T21:21:39.1128227Z Entering 'third_party/ittapi' 2025-08-14T21:21:39.1189796Z Entering 'third_party/kineto' 2025-08-14T21:21:39.1249830Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-08-14T21:21:39.1307368Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-08-14T21:21:39.1366919Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-08-14T21:21:39.1430360Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-08-14T21:21:39.1494878Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-08-14T21:21:39.1553373Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-08-14T21:21:39.1615999Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-08-14T21:21:39.1676680Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-08-14T21:21:39.1739063Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-08-14T21:21:39.1800098Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-08-14T21:21:39.1862729Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-08-14T21:21:39.1919864Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-08-14T21:21:39.1981644Z Entering 'third_party/kleidiai' 2025-08-14T21:21:39.2043987Z Entering 'third_party/mimalloc' 2025-08-14T21:21:39.2104819Z Entering 'third_party/nlohmann' 2025-08-14T21:21:39.2165677Z Entering 'third_party/onnx' 2025-08-14T21:21:39.2242700Z Entering 'third_party/onnx/third_party/pybind11' 2025-08-14T21:21:39.2312709Z Entering 'third_party/opentelemetry-cpp' 2025-08-14T21:21:39.2374756Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-08-14T21:21:39.2431950Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-08-14T21:21:39.2496920Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-08-14T21:21:39.2553656Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-08-14T21:21:39.2614046Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-08-14T21:21:39.2671544Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-08-14T21:21:39.2729577Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-08-14T21:21:39.2786735Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-08-14T21:21:39.2846862Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-08-14T21:21:39.2908584Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-08-14T21:21:39.2991000Z Entering 'third_party/pocketfft' 2025-08-14T21:21:39.3057113Z Entering 'third_party/protobuf' 2025-08-14T21:21:39.3121701Z Entering 'third_party/protobuf/third_party/benchmark' 2025-08-14T21:21:39.3179369Z Entering 'third_party/protobuf/third_party/googletest' 2025-08-14T21:21:39.3243202Z Entering 'third_party/psimd' 2025-08-14T21:21:39.3304376Z Entering 'third_party/pthreadpool' 2025-08-14T21:21:39.3364800Z Entering 'third_party/pybind11' 2025-08-14T21:21:39.3425212Z Entering 'third_party/python-peachpy' 2025-08-14T21:21:39.3486665Z Entering 'third_party/sleef' 2025-08-14T21:21:39.3549208Z Entering 'third_party/tensorpipe' 2025-08-14T21:21:39.3608952Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-08-14T21:21:39.3670509Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-08-14T21:21:39.3731013Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-08-14T21:21:39.3789423Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-08-14T21:21:39.3844859Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-08-14T21:21:39.3926911Z ##[endgroup] 2025-08-14T21:21:39.3927372Z ##[group]Persisting credentials for submodules 2025-08-14T21:21:39.3933451Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-08-14T21:21:39.4334845Z Entering 'android/libs/fbjni' 2025-08-14T21:21:39.4414652Z Entering 'third_party/FP16' 2025-08-14T21:21:39.4493461Z Entering 'third_party/FXdiv' 2025-08-14T21:21:39.4572706Z Entering 'third_party/NNPACK' 2025-08-14T21:21:39.4651663Z Entering 'third_party/NVTX' 2025-08-14T21:21:39.4729946Z Entering 'third_party/VulkanMemoryAllocator' 2025-08-14T21:21:39.4811372Z Entering 'third_party/XNNPACK' 2025-08-14T21:21:39.4906941Z Entering 'third_party/aiter' 2025-08-14T21:21:39.4985669Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-08-14T21:21:39.5074580Z Entering 'third_party/benchmark' 2025-08-14T21:21:39.5154240Z Entering 'third_party/composable_kernel' 2025-08-14T21:21:39.5240014Z Entering 'third_party/cpp-httplib' 2025-08-14T21:21:39.5320027Z Entering 'third_party/cpuinfo' 2025-08-14T21:21:39.5399404Z Entering 'third_party/cudnn_frontend' 2025-08-14T21:21:39.5478090Z Entering 'third_party/cutlass' 2025-08-14T21:21:39.5567689Z Entering 'third_party/fbgemm' 2025-08-14T21:21:39.5647173Z Entering 'third_party/fbgemm/external/asmjit' 2025-08-14T21:21:39.5724323Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-08-14T21:21:39.5808680Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-08-14T21:21:39.5884988Z Entering 'third_party/fbgemm/external/cutlass' 2025-08-14T21:21:39.5971233Z Entering 'third_party/fbgemm/external/googletest' 2025-08-14T21:21:39.6049182Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-08-14T21:21:39.6125114Z Entering 'third_party/fbgemm/external/json' 2025-08-14T21:21:39.6210267Z Entering 'third_party/flash-attention' 2025-08-14T21:21:39.6293399Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-08-14T21:21:39.6376001Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-08-14T21:21:39.6465513Z Entering 'third_party/flatbuffers' 2025-08-14T21:21:39.6545897Z Entering 'third_party/fmt' 2025-08-14T21:21:39.6628163Z Entering 'third_party/gemmlowp/gemmlowp' 2025-08-14T21:21:39.6708678Z Entering 'third_party/gloo' 2025-08-14T21:21:39.6789960Z Entering 'third_party/googletest' 2025-08-14T21:21:39.6872106Z Entering 'third_party/ideep' 2025-08-14T21:21:39.6955651Z Entering 'third_party/ideep/mkl-dnn' 2025-08-14T21:21:39.7040810Z Entering 'third_party/ittapi' 2025-08-14T21:21:39.7121061Z Entering 'third_party/kineto' 2025-08-14T21:21:39.7199055Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-08-14T21:21:39.7277128Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-08-14T21:21:39.7355671Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-08-14T21:21:39.7433347Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-08-14T21:21:39.7511693Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-08-14T21:21:39.7590692Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-08-14T21:21:39.7673869Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-08-14T21:21:39.7752038Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-08-14T21:21:39.7829946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-08-14T21:21:39.7909620Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-08-14T21:21:39.7991952Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-08-14T21:21:39.8071230Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-08-14T21:21:39.8156043Z Entering 'third_party/kleidiai' 2025-08-14T21:21:39.8238870Z Entering 'third_party/mimalloc' 2025-08-14T21:21:39.8319220Z Entering 'third_party/nlohmann' 2025-08-14T21:21:39.8401722Z Entering 'third_party/onnx' 2025-08-14T21:21:39.8506141Z Entering 'third_party/onnx/third_party/pybind11' 2025-08-14T21:21:39.8590734Z Entering 'third_party/opentelemetry-cpp' 2025-08-14T21:21:39.8668746Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-08-14T21:21:39.8745123Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-08-14T21:21:39.8822266Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-08-14T21:21:39.8901042Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-08-14T21:21:39.8978961Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-08-14T21:21:39.9055275Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-08-14T21:21:39.9133778Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-08-14T21:21:39.9209236Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-08-14T21:21:39.9289493Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-08-14T21:21:39.9372050Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-08-14T21:21:39.9473332Z Entering 'third_party/pocketfft' 2025-08-14T21:21:39.9551254Z Entering 'third_party/protobuf' 2025-08-14T21:21:39.9634410Z Entering 'third_party/protobuf/third_party/benchmark' 2025-08-14T21:21:39.9711542Z Entering 'third_party/protobuf/third_party/googletest' 2025-08-14T21:21:39.9793713Z Entering 'third_party/psimd' 2025-08-14T21:21:39.9872059Z Entering 'third_party/pthreadpool' 2025-08-14T21:21:39.9952026Z Entering 'third_party/pybind11' 2025-08-14T21:21:40.0029847Z Entering 'third_party/python-peachpy' 2025-08-14T21:21:40.0109104Z Entering 'third_party/sleef' 2025-08-14T21:21:40.0191277Z Entering 'third_party/tensorpipe' 2025-08-14T21:21:40.0267417Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-08-14T21:21:40.0344066Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-08-14T21:21:40.0420266Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-08-14T21:21:40.0497207Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-08-14T21:21:40.0572011Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-08-14T21:21:40.0678634Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-08-14T21:21:40.1080732Z Entering 'android/libs/fbjni' 2025-08-14T21:21:40.1152193Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-08-14T21:21:40.1177175Z Entering 'third_party/FP16' 2025-08-14T21:21:40.1252247Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-08-14T21:21:40.1277501Z Entering 'third_party/FXdiv' 2025-08-14T21:21:40.1350008Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-08-14T21:21:40.1375382Z Entering 'third_party/NNPACK' 2025-08-14T21:21:40.1449041Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-08-14T21:21:40.1474503Z Entering 'third_party/NVTX' 2025-08-14T21:21:40.1546839Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-08-14T21:21:40.1572333Z Entering 'third_party/VulkanMemoryAllocator' 2025-08-14T21:21:40.1646122Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-08-14T21:21:40.1674327Z Entering 'third_party/XNNPACK' 2025-08-14T21:21:40.1746707Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-08-14T21:21:40.1787544Z Entering 'third_party/aiter' 2025-08-14T21:21:40.1860391Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-08-14T21:21:40.1885566Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-08-14T21:21:40.1957382Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-08-14T21:21:40.1992852Z Entering 'third_party/benchmark' 2025-08-14T21:21:40.2066647Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-08-14T21:21:40.2090899Z Entering 'third_party/composable_kernel' 2025-08-14T21:21:40.2166347Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-08-14T21:21:40.2199715Z Entering 'third_party/cpp-httplib' 2025-08-14T21:21:40.2272703Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-08-14T21:21:40.2297444Z Entering 'third_party/cpuinfo' 2025-08-14T21:21:40.2370597Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-08-14T21:21:40.2395815Z Entering 'third_party/cudnn_frontend' 2025-08-14T21:21:40.2470154Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-08-14T21:21:40.2495530Z Entering 'third_party/cutlass' 2025-08-14T21:21:40.2568718Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-08-14T21:21:40.2602789Z Entering 'third_party/fbgemm' 2025-08-14T21:21:40.2675630Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-08-14T21:21:40.2700948Z Entering 'third_party/fbgemm/external/asmjit' 2025-08-14T21:21:40.2778975Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-08-14T21:21:40.2803415Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-08-14T21:21:40.2874982Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-08-14T21:21:40.2905159Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-08-14T21:21:40.2977636Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-08-14T21:21:40.3003667Z Entering 'third_party/fbgemm/external/cutlass' 2025-08-14T21:21:40.3076434Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-08-14T21:21:40.3111120Z Entering 'third_party/fbgemm/external/googletest' 2025-08-14T21:21:40.3182789Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-08-14T21:21:40.3206588Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-08-14T21:21:40.3278139Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-08-14T21:21:40.3301859Z Entering 'third_party/fbgemm/external/json' 2025-08-14T21:21:40.3375142Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-08-14T21:21:40.3407511Z Entering 'third_party/flash-attention' 2025-08-14T21:21:40.3487147Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-08-14T21:21:40.3514170Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-08-14T21:21:40.3584806Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-08-14T21:21:40.3614737Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-08-14T21:21:40.3685530Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-08-14T21:21:40.3721352Z Entering 'third_party/flatbuffers' 2025-08-14T21:21:40.3795910Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-08-14T21:21:40.3824092Z Entering 'third_party/fmt' 2025-08-14T21:21:40.3899595Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-08-14T21:21:40.3924730Z Entering 'third_party/gemmlowp/gemmlowp' 2025-08-14T21:21:40.3996024Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-08-14T21:21:40.4021296Z Entering 'third_party/gloo' 2025-08-14T21:21:40.4091051Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-08-14T21:21:40.4116715Z Entering 'third_party/googletest' 2025-08-14T21:21:40.4190586Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-08-14T21:21:40.4215773Z Entering 'third_party/ideep' 2025-08-14T21:21:40.4289549Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-08-14T21:21:40.4312811Z Entering 'third_party/ideep/mkl-dnn' 2025-08-14T21:21:40.4384366Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-08-14T21:21:40.4418023Z Entering 'third_party/ittapi' 2025-08-14T21:21:40.4491630Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-08-14T21:21:40.4517031Z Entering 'third_party/kineto' 2025-08-14T21:21:40.4587996Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-08-14T21:21:40.4610558Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-08-14T21:21:40.4687547Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-08-14T21:21:40.4710519Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-08-14T21:21:40.4782787Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-08-14T21:21:40.4808522Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-08-14T21:21:40.4880513Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-08-14T21:21:40.4905441Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-08-14T21:21:40.4976952Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-08-14T21:21:40.5001097Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-08-14T21:21:40.5075419Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-08-14T21:21:40.5096022Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-08-14T21:21:40.5172537Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-08-14T21:21:40.5200840Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-08-14T21:21:40.5272473Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-08-14T21:21:40.5297367Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-08-14T21:21:40.5369270Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-08-14T21:21:40.5393722Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-08-14T21:21:40.5466327Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-08-14T21:21:40.5491673Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-08-14T21:21:40.5564088Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-08-14T21:21:40.5591785Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-08-14T21:21:40.5662969Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-08-14T21:21:40.5687092Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-08-14T21:21:40.5760210Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-08-14T21:21:40.5787617Z Entering 'third_party/kleidiai' 2025-08-14T21:21:40.5858483Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-08-14T21:21:40.5885000Z Entering 'third_party/mimalloc' 2025-08-14T21:21:40.5957375Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-08-14T21:21:40.5982474Z Entering 'third_party/nlohmann' 2025-08-14T21:21:40.6053641Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-08-14T21:21:40.6078933Z Entering 'third_party/onnx' 2025-08-14T21:21:40.6155530Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-08-14T21:21:40.6196626Z Entering 'third_party/onnx/third_party/pybind11' 2025-08-14T21:21:40.6272184Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-08-14T21:21:40.6301996Z Entering 'third_party/opentelemetry-cpp' 2025-08-14T21:21:40.6373527Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-08-14T21:21:40.6399541Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-08-14T21:21:40.6472129Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-08-14T21:21:40.6496047Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-08-14T21:21:40.6568375Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-08-14T21:21:40.6592539Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-08-14T21:21:40.6664385Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-08-14T21:21:40.6687482Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-08-14T21:21:40.6757797Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-08-14T21:21:40.6782784Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-08-14T21:21:40.6855461Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-08-14T21:21:40.6879162Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-08-14T21:21:40.6951222Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-08-14T21:21:40.6974324Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-08-14T21:21:40.7046084Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-08-14T21:21:40.7067058Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-08-14T21:21:40.7139402Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-08-14T21:21:40.7166929Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-08-14T21:21:40.7237773Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-08-14T21:21:40.7267499Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-08-14T21:21:40.7337922Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-08-14T21:21:40.7385031Z Entering 'third_party/pocketfft' 2025-08-14T21:21:40.7456998Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-08-14T21:21:40.7481939Z Entering 'third_party/protobuf' 2025-08-14T21:21:40.7556892Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-08-14T21:21:40.7583138Z Entering 'third_party/protobuf/third_party/benchmark' 2025-08-14T21:21:40.7658073Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-08-14T21:21:40.7681810Z Entering 'third_party/protobuf/third_party/googletest' 2025-08-14T21:21:40.7752673Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-08-14T21:21:40.7780568Z Entering 'third_party/psimd' 2025-08-14T21:21:40.7859561Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-08-14T21:21:40.7885349Z Entering 'third_party/pthreadpool' 2025-08-14T21:21:40.7956900Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-08-14T21:21:40.7982021Z Entering 'third_party/pybind11' 2025-08-14T21:21:40.8062885Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-08-14T21:21:40.8087879Z Entering 'third_party/python-peachpy' 2025-08-14T21:21:40.8159685Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-08-14T21:21:40.8184978Z Entering 'third_party/sleef' 2025-08-14T21:21:40.8256262Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-08-14T21:21:40.8283597Z Entering 'third_party/tensorpipe' 2025-08-14T21:21:40.8357947Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-08-14T21:21:40.8382302Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-08-14T21:21:40.8452947Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-08-14T21:21:40.8476369Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-08-14T21:21:40.8548130Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-08-14T21:21:40.8570432Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-08-14T21:21:40.8643231Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-08-14T21:21:40.8666805Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-08-14T21:21:40.8737915Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-08-14T21:21:40.8759469Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-08-14T21:21:40.8835770Z file:/home/ec2-user/actions-runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-08-14T21:21:40.9753542Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-08-14T21:21:41.0163049Z Entering 'android/libs/fbjni' 2025-08-14T21:21:41.0224060Z Entering 'third_party/FP16' 2025-08-14T21:21:41.0284625Z Entering 'third_party/FXdiv' 2025-08-14T21:21:41.0344694Z Entering 'third_party/NNPACK' 2025-08-14T21:21:41.0406298Z Entering 'third_party/NVTX' 2025-08-14T21:21:41.0468763Z Entering 'third_party/VulkanMemoryAllocator' 2025-08-14T21:21:41.0529393Z Entering 'third_party/XNNPACK' 2025-08-14T21:21:41.0605813Z Entering 'third_party/aiter' 2025-08-14T21:21:41.0667572Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-08-14T21:21:41.0741099Z Entering 'third_party/benchmark' 2025-08-14T21:21:41.0802514Z Entering 'third_party/composable_kernel' 2025-08-14T21:21:41.0871877Z Entering 'third_party/cpp-httplib' 2025-08-14T21:21:41.0933113Z Entering 'third_party/cpuinfo' 2025-08-14T21:21:41.0994663Z Entering 'third_party/cudnn_frontend' 2025-08-14T21:21:41.1055252Z Entering 'third_party/cutlass' 2025-08-14T21:21:41.1124206Z Entering 'third_party/fbgemm' 2025-08-14T21:21:41.1187448Z Entering 'third_party/fbgemm/external/asmjit' 2025-08-14T21:21:41.1247442Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-08-14T21:21:41.1314283Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-08-14T21:21:41.1377340Z Entering 'third_party/fbgemm/external/cutlass' 2025-08-14T21:21:41.1443307Z Entering 'third_party/fbgemm/external/googletest' 2025-08-14T21:21:41.1503027Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-08-14T21:21:41.1562131Z Entering 'third_party/fbgemm/external/json' 2025-08-14T21:21:41.1628754Z Entering 'third_party/flash-attention' 2025-08-14T21:21:41.1691150Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-08-14T21:21:41.1755296Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-08-14T21:21:41.1830684Z Entering 'third_party/flatbuffers' 2025-08-14T21:21:41.1896857Z Entering 'third_party/fmt' 2025-08-14T21:21:41.1960114Z Entering 'third_party/gemmlowp/gemmlowp' 2025-08-14T21:21:41.2020791Z Entering 'third_party/gloo' 2025-08-14T21:21:41.2082341Z Entering 'third_party/googletest' 2025-08-14T21:21:41.2143408Z Entering 'third_party/ideep' 2025-08-14T21:21:41.2203159Z Entering 'third_party/ideep/mkl-dnn' 2025-08-14T21:21:41.2273569Z Entering 'third_party/ittapi' 2025-08-14T21:21:41.2334231Z Entering 'third_party/kineto' 2025-08-14T21:21:41.2393986Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-08-14T21:21:41.2451854Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-08-14T21:21:41.2516149Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-08-14T21:21:41.2574985Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-08-14T21:21:41.2633348Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-08-14T21:21:41.2692579Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-08-14T21:21:41.2758296Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-08-14T21:21:41.2817920Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-08-14T21:21:41.2877852Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-08-14T21:21:41.2939834Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-08-14T21:21:41.3003844Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-08-14T21:21:41.3062535Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-08-14T21:21:41.3124878Z Entering 'third_party/kleidiai' 2025-08-14T21:21:41.3187406Z Entering 'third_party/mimalloc' 2025-08-14T21:21:41.3249846Z Entering 'third_party/nlohmann' 2025-08-14T21:21:41.3311859Z Entering 'third_party/onnx' 2025-08-14T21:21:41.3388814Z Entering 'third_party/onnx/third_party/pybind11' 2025-08-14T21:21:41.3455227Z Entering 'third_party/opentelemetry-cpp' 2025-08-14T21:21:41.3518097Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-08-14T21:21:41.3576693Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-08-14T21:21:41.3634320Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-08-14T21:21:41.3695525Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-08-14T21:21:41.3754217Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-08-14T21:21:41.3810795Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-08-14T21:21:41.3869357Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-08-14T21:21:41.3924618Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-08-14T21:21:41.3986301Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-08-14T21:21:41.4049265Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-08-14T21:21:41.4128375Z Entering 'third_party/pocketfft' 2025-08-14T21:21:41.4191422Z Entering 'third_party/protobuf' 2025-08-14T21:21:41.4254739Z Entering 'third_party/protobuf/third_party/benchmark' 2025-08-14T21:21:41.4312225Z Entering 'third_party/protobuf/third_party/googletest' 2025-08-14T21:21:41.4376316Z Entering 'third_party/psimd' 2025-08-14T21:21:41.4437439Z Entering 'third_party/pthreadpool' 2025-08-14T21:21:41.4497208Z Entering 'third_party/pybind11' 2025-08-14T21:21:41.4561966Z Entering 'third_party/python-peachpy' 2025-08-14T21:21:41.4622997Z Entering 'third_party/sleef' 2025-08-14T21:21:41.4684612Z Entering 'third_party/tensorpipe' 2025-08-14T21:21:41.4743435Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-08-14T21:21:41.4802965Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-08-14T21:21:41.4865426Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-08-14T21:21:41.4923204Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-08-14T21:21:41.4979261Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-08-14T21:21:41.5069575Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-08-14T21:21:41.5469209Z Entering 'android/libs/fbjni' 2025-08-14T21:21:41.5530011Z Entering 'third_party/FP16' 2025-08-14T21:21:41.5589827Z Entering 'third_party/FXdiv' 2025-08-14T21:21:41.5649184Z Entering 'third_party/NNPACK' 2025-08-14T21:21:41.5709669Z Entering 'third_party/NVTX' 2025-08-14T21:21:41.5773945Z Entering 'third_party/VulkanMemoryAllocator' 2025-08-14T21:21:41.5833370Z Entering 'third_party/XNNPACK' 2025-08-14T21:21:41.5908292Z Entering 'third_party/aiter' 2025-08-14T21:21:41.5970226Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-08-14T21:21:41.6040875Z Entering 'third_party/benchmark' 2025-08-14T21:21:41.6101684Z Entering 'third_party/composable_kernel' 2025-08-14T21:21:41.6170672Z Entering 'third_party/cpp-httplib' 2025-08-14T21:21:41.6229347Z Entering 'third_party/cpuinfo' 2025-08-14T21:21:41.6290031Z Entering 'third_party/cudnn_frontend' 2025-08-14T21:21:41.6349692Z Entering 'third_party/cutlass' 2025-08-14T21:21:41.6422495Z Entering 'third_party/fbgemm' 2025-08-14T21:21:41.6485454Z Entering 'third_party/fbgemm/external/asmjit' 2025-08-14T21:21:41.6545035Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-08-14T21:21:41.6616140Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-08-14T21:21:41.6674828Z Entering 'third_party/fbgemm/external/cutlass' 2025-08-14T21:21:41.6742144Z Entering 'third_party/fbgemm/external/googletest' 2025-08-14T21:21:41.6800334Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-08-14T21:21:41.6858849Z Entering 'third_party/fbgemm/external/json' 2025-08-14T21:21:41.6922519Z Entering 'third_party/flash-attention' 2025-08-14T21:21:41.7013609Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-08-14T21:21:41.7079245Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-08-14T21:21:41.7155067Z Entering 'third_party/flatbuffers' 2025-08-14T21:21:41.7218579Z Entering 'third_party/fmt' 2025-08-14T21:21:41.7283080Z Entering 'third_party/gemmlowp/gemmlowp' 2025-08-14T21:21:41.7342649Z Entering 'third_party/gloo' 2025-08-14T21:21:41.7402129Z Entering 'third_party/googletest' 2025-08-14T21:21:41.7463609Z Entering 'third_party/ideep' 2025-08-14T21:21:41.7521810Z Entering 'third_party/ideep/mkl-dnn' 2025-08-14T21:21:41.7589623Z Entering 'third_party/ittapi' 2025-08-14T21:21:41.7648524Z Entering 'third_party/kineto' 2025-08-14T21:21:41.7710348Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-08-14T21:21:41.7768569Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-08-14T21:21:41.7829367Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-08-14T21:21:41.7889012Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-08-14T21:21:41.7947273Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-08-14T21:21:41.8005195Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-08-14T21:21:41.8069939Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-08-14T21:21:41.8128415Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-08-14T21:21:41.8186882Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-08-14T21:21:41.8249651Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-08-14T21:21:41.8312660Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-08-14T21:21:41.8373114Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-08-14T21:21:41.8435130Z Entering 'third_party/kleidiai' 2025-08-14T21:21:41.8498271Z Entering 'third_party/mimalloc' 2025-08-14T21:21:41.8558759Z Entering 'third_party/nlohmann' 2025-08-14T21:21:41.8621061Z Entering 'third_party/onnx' 2025-08-14T21:21:41.8697520Z Entering 'third_party/onnx/third_party/pybind11' 2025-08-14T21:21:41.8766066Z Entering 'third_party/opentelemetry-cpp' 2025-08-14T21:21:41.8826180Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-08-14T21:21:41.8885237Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-08-14T21:21:41.8949900Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-08-14T21:21:41.9011712Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-08-14T21:21:41.9069717Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-08-14T21:21:41.9126163Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-08-14T21:21:41.9187600Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-08-14T21:21:41.9248411Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-08-14T21:21:41.9308023Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-08-14T21:21:41.9372235Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-08-14T21:21:41.9456627Z Entering 'third_party/pocketfft' 2025-08-14T21:21:41.9516949Z Entering 'third_party/protobuf' 2025-08-14T21:21:41.9578774Z Entering 'third_party/protobuf/third_party/benchmark' 2025-08-14T21:21:41.9635440Z Entering 'third_party/protobuf/third_party/googletest' 2025-08-14T21:21:41.9697359Z Entering 'third_party/psimd' 2025-08-14T21:21:41.9756810Z Entering 'third_party/pthreadpool' 2025-08-14T21:21:41.9816773Z Entering 'third_party/pybind11' 2025-08-14T21:21:41.9880472Z Entering 'third_party/python-peachpy' 2025-08-14T21:21:41.9941267Z Entering 'third_party/sleef' 2025-08-14T21:21:42.0007129Z Entering 'third_party/tensorpipe' 2025-08-14T21:21:42.0070655Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-08-14T21:21:42.0130140Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-08-14T21:21:42.0189063Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-08-14T21:21:42.0248277Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-08-14T21:21:42.0304930Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-08-14T21:21:42.0387359Z ##[endgroup] 2025-08-14T21:21:42.0433483Z [command]/usr/bin/git log -1 --format=%H 2025-08-14T21:21:42.0466345Z 1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:21:42.0674088Z Prepare all required actions 2025-08-14T21:21:42.0674596Z Getting action download info 2025-08-14T21:21:42.2205981Z ##[group]Run ./.github/actions/setup-linux 2025-08-14T21:21:42.2206284Z env: 2025-08-14T21:21:42.2206495Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:42.2206737Z ##[endgroup] 2025-08-14T21:21:42.2253945Z ##[group]Run set -euo pipefail 2025-08-14T21:21:42.2254264Z set -euo pipefail 2025-08-14T21:21:42.2254542Z function get_ec2_metadata() { 2025-08-14T21:21:42.2254893Z  # Pulled from instance metadata endpoint for EC2 2025-08-14T21:21:42.2255473Z  # see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html 2025-08-14T21:21:42.2255985Z  category=$1 2025-08-14T21:21:42.2256321Z  # If it is GCP runner (runner name contains gcp), do not run this 2025-08-14T21:21:42.2256732Z  runner_name_str=i-006ff0d0e48756a9a 2025-08-14T21:21:42.2257088Z  if [[ -f /.inarc ]]; then 2025-08-14T21:21:42.2257424Z  echo "ARC Runner, no info on ec2 metadata" 2025-08-14T21:21:42.2257793Z  elif [[ $runner_name_str == *"gcp"* ]]; then 2025-08-14T21:21:42.2258231Z  echo "Runner is from Google Cloud Platform, No info on ec2 metadata" 2025-08-14T21:21:42.2258622Z  else 2025-08-14T21:21:42.2259390Z  curl -H "X-aws-ec2-metadata-token: $(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 30")" -fsSL "http://169.254.169.254/latest/meta-data/${category}" 2025-08-14T21:21:42.2260203Z  fi 2025-08-14T21:21:42.2260422Z } 2025-08-14T21:21:42.2260679Z echo "ami-id: $(get_ec2_metadata ami-id)" 2025-08-14T21:21:42.2261083Z echo "instance-id: $(get_ec2_metadata instance-id)" 2025-08-14T21:21:42.2261534Z echo "instance-type: $(get_ec2_metadata instance-type)" 2025-08-14T21:21:42.2261923Z echo "system info $(uname -a)" 2025-08-14T21:21:42.2273135Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:42.2273490Z env: 2025-08-14T21:21:42.2273711Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:42.2273955Z ##[endgroup] 2025-08-14T21:21:42.2460009Z ami-id: ami-05ffe3c48a9991133 2025-08-14T21:21:42.2587749Z instance-id: i-006ff0d0e48756a9a 2025-08-14T21:21:42.2715572Z instance-type: g5.4xlarge 2025-08-14T21:21:42.2730899Z system info Linux ip-10-0-74-8.ec2.internal 6.1.141-155.222.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Jun 17 10:29:47 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux 2025-08-14T21:21:42.2762995Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-08-14T21:21:42.2763889Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-08-14T21:21:42.2774240Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:42.2774622Z env: 2025-08-14T21:21:42.2774834Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:42.2775096Z ##[endgroup] 2025-08-14T21:21:42.2846645Z ##[group]Run if systemctl is-active --quiet docker; then 2025-08-14T21:21:42.2847420Z if systemctl is-active --quiet docker; then 2025-08-14T21:21:42.2847919Z  echo "Docker daemon is running..."; 2025-08-14T21:21:42.2848359Z else 2025-08-14T21:21:42.2848780Z  echo "Starting docker daemon..." && sudo systemctl start docker; 2025-08-14T21:21:42.2849175Z fi 2025-08-14T21:21:42.2857501Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:42.2857869Z env: 2025-08-14T21:21:42.2858095Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:42.2858361Z ##[endgroup] 2025-08-14T21:21:42.2957560Z Docker daemon is running... 2025-08-14T21:21:42.3006537Z ##[group]Run nick-fields/retry@v3.0.0 2025-08-14T21:21:42.3006824Z with: 2025-08-14T21:21:42.3007026Z shell: bash 2025-08-14T21:21:42.3007463Z timeout_minutes: 5 2025-08-14T21:21:42.3007722Z max_attempts: 3 2025-08-14T21:21:42.3007955Z retry_wait_seconds: 30 2025-08-14T21:21:42.3010006Z command: AWS_ACCOUNT_ID=$(aws sts get-caller-identity|grep Account|cut -f4 -d\") aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" # For LF Runners we need to make sure we also login to Meta's ECR docker registry too. META_AWS_ACCOUNT_ID=308535385114 if [ "$AWS_ACCOUNT_ID" != "$META_AWS_ACCOUNT_ID" ] ; then aws ecr get-login-password --region "$AWS_DEFAULT_REGION" | docker login --username AWS \ --password-stdin "$META_AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com" fi 2025-08-14T21:21:42.3012054Z polling_interval_seconds: 1 2025-08-14T21:21:42.3012335Z warning_on_retry: true 2025-08-14T21:21:42.3012592Z continue_on_error: false 2025-08-14T21:21:42.3012833Z env: 2025-08-14T21:21:42.3013049Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:42.3013314Z AWS_RETRY_MODE: standard 2025-08-14T21:21:42.3013561Z AWS_MAX_ATTEMPTS: 5 2025-08-14T21:21:42.3013818Z AWS_DEFAULT_REGION: us-east-1 2025-08-14T21:21:42.3014081Z ##[endgroup] 2025-08-14T21:21:43.4529965Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-08-14T21:21:43.4530525Z Configure a credential helper to remove this warning. See 2025-08-14T21:21:43.4531064Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-08-14T21:21:43.4531425Z 2025-08-14T21:21:43.4531804Z Login Succeeded 2025-08-14T21:21:44.3891415Z Command completed after 1 attempt(s). 2025-08-14T21:21:44.3976937Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-08-14T21:21:44.3977431Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-08-14T21:21:44.3977866Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-08-14T21:21:44.3993658Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:44.3994421Z env: 2025-08-14T21:21:44.3994734Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:44.3995130Z ##[endgroup] 2025-08-14T21:21:44.4107573Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-08-14T21:21:44.4108099Z # ignore expansion of "docker ps -q" since it could be empty 2025-08-14T21:21:44.4108500Z # shellcheck disable=SC2046 2025-08-14T21:21:44.4108827Z docker stop $(docker ps -q) || true 2025-08-14T21:21:44.4109155Z # Prune all of the docker images 2025-08-14T21:21:44.4109469Z docker system prune -af 2025-08-14T21:21:44.4118067Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:44.4118410Z env: 2025-08-14T21:21:44.4118622Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:44.4118878Z ##[endgroup] 2025-08-14T21:21:44.4403517Z "docker stop" requires at least 1 argument. 2025-08-14T21:21:44.4404047Z See 'docker stop --help'. 2025-08-14T21:21:44.4404266Z 2025-08-14T21:21:44.4404663Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 2025-08-14T21:21:44.4405008Z 2025-08-14T21:21:44.4405155Z Stop one or more running containers 2025-08-14T21:21:44.4650138Z Total reclaimed space: 0B 2025-08-14T21:21:44.4707255Z ##[group]Run set +e 2025-08-14T21:21:44.4707525Z set +e 2025-08-14T21:21:44.4707749Z set -x 2025-08-14T21:21:44.4707972Z  2025-08-14T21:21:44.4708215Z PT_DOMAIN=download.pytorch.org 2025-08-14T21:21:44.4708756Z # TODO: Flaky access to download.pytorch.org https://github.com/pytorch/pytorch/issues/100400, 2025-08-14T21:21:44.4709436Z # cleaning this up once the issue is fixed. There are more than one resolved IP here, the last 2025-08-14T21:21:44.4709920Z # one is returned at random 2025-08-14T21:21:44.4710298Z RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" | tail -n1) 2025-08-14T21:21:44.4710649Z  2025-08-14T21:21:44.4711063Z if [ -z "${RESOLVED_IP}" ]; then 2025-08-14T21:21:44.4711489Z  echo "Couldn't resolve ${PT_DOMAIN}, retrying with Google DNS..." 2025-08-14T21:21:44.4711976Z  RESOLVED_IP=$(dig -4 +short "${PT_DOMAIN}" @8.8.8.8 | tail -n1) 2025-08-14T21:21:44.4712334Z  2025-08-14T21:21:44.4712568Z  if [ -z "${RESOLVED_IP}" ]; then 2025-08-14T21:21:44.4712934Z  echo "Couldn't resolve ${PT_DOMAIN}, exiting..." 2025-08-14T21:21:44.4713270Z  exit 1 2025-08-14T21:21:44.4713500Z  fi 2025-08-14T21:21:44.4713709Z fi 2025-08-14T21:21:44.4713911Z  2025-08-14T21:21:44.4714166Z if grep -r "${PT_DOMAIN}" /etc/hosts; then 2025-08-14T21:21:44.4714517Z  # Clean up any old records first 2025-08-14T21:21:44.4714860Z  sudo sed -i "/${PT_DOMAIN}/d" /etc/hosts 2025-08-14T21:21:44.4715160Z fi 2025-08-14T21:21:44.4715363Z  2025-08-14T21:21:44.4715667Z echo "${RESOLVED_IP} ${PT_DOMAIN}" | sudo tee -a /etc/hosts 2025-08-14T21:21:44.4716039Z cat /etc/hosts 2025-08-14T21:21:44.4725188Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:44.4725535Z env: 2025-08-14T21:21:44.4725747Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:44.4726002Z ##[endgroup] 2025-08-14T21:21:44.4756929Z + PT_DOMAIN=download.pytorch.org 2025-08-14T21:21:44.4764355Z ++ dig -4 +short download.pytorch.org 2025-08-14T21:21:44.4765416Z ++ tail -n1 2025-08-14T21:21:44.5356104Z + RESOLVED_IP=18.160.10.28 2025-08-14T21:21:44.5356375Z + '[' -z 18.160.10.28 ']' 2025-08-14T21:21:44.5356651Z + grep -r download.pytorch.org /etc/hosts 2025-08-14T21:21:44.5377497Z + echo '18.160.10.28 download.pytorch.org' 2025-08-14T21:21:44.5378212Z + sudo tee -a /etc/hosts 2025-08-14T21:21:44.8905231Z 18.160.10.28 download.pytorch.org 2025-08-14T21:21:44.8927450Z + cat /etc/hosts 2025-08-14T21:21:44.8940812Z 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 2025-08-14T21:21:44.8947381Z ::1 localhost6 localhost6.localdomain6 2025-08-14T21:21:44.8948092Z 18.160.10.28 download.pytorch.org 2025-08-14T21:21:44.9122013Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-08-14T21:21:44.9122458Z with: 2025-08-14T21:21:44.9123197Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:44.9124027Z use-custom-docker-registry: true 2025-08-14T21:21:44.9124335Z docker-build-dir: .ci/docker 2025-08-14T21:21:44.9124767Z docker-build-script: ./build.sh 2025-08-14T21:21:44.9125057Z working-directory: . 2025-08-14T21:21:44.9125406Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:44.9125793Z force-push: false 2025-08-14T21:21:44.9126016Z env: 2025-08-14T21:21:44.9126233Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:44.9126494Z ##[endgroup] 2025-08-14T21:21:44.9159898Z ##[group]Run set -ex 2025-08-14T21:21:44.9160202Z set -ex 2025-08-14T21:21:44.9160432Z  2025-08-14T21:21:44.9160832Z # If the docker build directory or the build script doesn't exist, the action will 2025-08-14T21:21:44.9161452Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-08-14T21:21:44.9161979Z # job could then download the pre-built image as usual 2025-08-14T21:21:44.9162614Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-08-14T21:21:44.9163205Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9163515Z else 2025-08-14T21:21:44.9163773Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9164238Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9164725Z  2025-08-14T21:21:44.9165231Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 2025-08-14T21:21:44.9165817Z  exit 0 2025-08-14T21:21:44.9166038Z fi 2025-08-14T21:21:44.9166400Z  2025-08-14T21:21:44.9166726Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-08-14T21:21:44.9167285Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-08-14T21:21:44.9167780Z  # use it as it is, but first let's extract the tag 2025-08-14T21:21:44.9168229Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-08-14T21:21:44.9168708Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9169170Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9169678Z else 2025-08-14T21:21:44.9169934Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-08-14T21:21:44.9170312Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-08-14T21:21:44.9170696Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-08-14T21:21:44.9171010Z  fi 2025-08-14T21:21:44.9171436Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-08-14T21:21:44.9172002Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9172596Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9173234Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9173643Z fi 2025-08-14T21:21:44.9184663Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:44.9185153Z env: 2025-08-14T21:21:44.9185374Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:44.9185633Z REPO_NAME: pytorch 2025-08-14T21:21:44.9186581Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:44.9187522Z DOCKER_BUILD_DIR: .ci/docker 2025-08-14T21:21:44.9187808Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-08-14T21:21:44.9188303Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:44.9188695Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-08-14T21:21:44.9188989Z CUSTOM_TAG_PREFIX: 2025-08-14T21:21:44.9189225Z ##[endgroup] 2025-08-14T21:21:44.9224063Z + [[ -d .ci/docker ]] 2025-08-14T21:21:44.9224420Z + [[ -f .ci/docker/./build.sh ]] 2025-08-14T21:21:44.9224800Z + [[ true == \t\r\u\e ]] 2025-08-14T21:21:44.9225128Z + echo skip=false 2025-08-14T21:21:44.9226314Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-08-14T21:21:44.9233220Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:44.9234003Z ++ awk -F '[:,]' '{print $2}' 2025-08-14T21:21:44.9263845Z + DOCKER_TAG=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:44.9264757Z + echo docker-tag=pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:44.9265845Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:44.9303898Z ##[group]Run set +e 2025-08-14T21:21:44.9304214Z set +e 2025-08-14T21:21:44.9304457Z set -x 2025-08-14T21:21:44.9304673Z  2025-08-14T21:21:44.9304876Z login() { 2025-08-14T21:21:44.9305319Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-08-14T21:21:44.9305808Z } 2025-08-14T21:21:44.9306018Z  2025-08-14T21:21:44.9306222Z retry () { 2025-08-14T21:21:44.9306484Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-08-14T21:21:44.9306776Z } 2025-08-14T21:21:44.9306978Z  2025-08-14T21:21:44.9307205Z retry login "${DOCKER_REGISTRY}" 2025-08-14T21:21:44.9307489Z  2025-08-14T21:21:44.9307697Z START_TIME=$(date +%s) 2025-08-14T21:21:44.9307978Z # Wait up to 120 minutes 2025-08-14T21:21:44.9308325Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-08-14T21:21:44.9308778Z  # Check if image already exists, if it does then skip building it 2025-08-14T21:21:44.9309231Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-08-14T21:21:44.9309578Z  exit 0 2025-08-14T21:21:44.9309796Z  fi 2025-08-14T21:21:44.9310002Z  2025-08-14T21:21:44.9310375Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-08-14T21:21:44.9310981Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-08-14T21:21:44.9311579Z  # latter, it will wait for the Docker images to become available before continuing 2025-08-14T21:21:44.9312064Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-08-14T21:21:44.9312440Z  # It's a Docker build job, let's build the image 2025-08-14T21:21:44.9312766Z  break 2025-08-14T21:21:44.9312983Z  else 2025-08-14T21:21:44.9313307Z  # It's a regular build job, wait for the image to become available 2025-08-14T21:21:44.9313695Z  sleep 300 2025-08-14T21:21:44.9313927Z  fi 2025-08-14T21:21:44.9314140Z done 2025-08-14T21:21:44.9314347Z  2025-08-14T21:21:44.9314669Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-08-14T21:21:44.9317250Z # be empty. The default action would be to continue rebuild the image 2025-08-14T21:21:44.9317748Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-08-14T21:21:44.9318176Z  # if we're on the base branch then use the parent commit 2025-08-14T21:21:44.9318551Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-08-14T21:21:44.9318855Z else 2025-08-14T21:21:44.9319170Z  # otherwise we're on a PR, so use the most recent base commit 2025-08-14T21:21:44.9319609Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-08-14T21:21:44.9319950Z fi 2025-08-14T21:21:44.9320155Z  2025-08-14T21:21:44.9320378Z if [[ -z "${MERGE_BASE}" ]]; then 2025-08-14T21:21:44.9320719Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9321031Z  2025-08-14T21:21:44.9321466Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-08-14T21:21:44.9321980Z  exit 0 2025-08-14T21:21:44.9322202Z fi 2025-08-14T21:21:44.9322414Z  2025-08-14T21:21:44.9322705Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-08-14T21:21:44.9323332Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-08-14T21:21:44.9323866Z  exit 1 2025-08-14T21:21:44.9324094Z fi 2025-08-14T21:21:44.9324295Z  2025-08-14T21:21:44.9324773Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-08-14T21:21:44.9325382Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-08-14T21:21:44.9325936Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-08-14T21:21:44.9326558Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-08-14T21:21:44.9327265Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-08-14T21:21:44.9327687Z fi 2025-08-14T21:21:44.9327886Z  2025-08-14T21:21:44.9328138Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-08-14T21:21:44.9336829Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:44.9337180Z env: 2025-08-14T21:21:44.9337383Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:44.9337647Z DOCKER_BUILD_DIR: .ci/docker 2025-08-14T21:21:44.9337970Z BASE_REVISION: 1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:21:44.9338804Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:44.9339846Z DOCKER_TAG: pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:44.9340486Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:44.9340864Z DOCKER_PUSH: 2025-08-14T21:21:44.9341075Z ##[endgroup] 2025-08-14T21:21:44.9373344Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:44.9373761Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:44.9377334Z + aws ecr get-login-password --region us-east-1 2025-08-14T21:21:44.9378583Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:45.4873669Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-08-14T21:21:45.4874417Z Configure a credential helper to remove this warning. See 2025-08-14T21:21:45.4874947Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-08-14T21:21:45.4875299Z 2025-08-14T21:21:45.4876082Z Login Succeeded 2025-08-14T21:21:45.4902202Z ++ date +%s 2025-08-14T21:21:45.4915876Z + START_TIME=1755206505 2025-08-14T21:21:45.4920523Z ++ date +%s 2025-08-14T21:21:45.4934060Z + [[ 1755199305 -lt 1755206505 ]] 2025-08-14T21:21:45.4934893Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:45.7568439Z { 2025-08-14T21:21:45.7568728Z "schemaVersion": 2, 2025-08-14T21:21:45.7569270Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-08-14T21:21:45.7569820Z "config": { 2025-08-14T21:21:45.7570237Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-08-14T21:21:45.7570733Z "size": 31137, 2025-08-14T21:21:45.7571247Z "digest": "sha256:9164ab1496895ac66808d8e6566acf1991fc744d321a8ac71d8dbd8989ac57a8" 2025-08-14T21:21:45.7571809Z }, 2025-08-14T21:21:45.7572060Z "layers": [ 2025-08-14T21:21:45.7572322Z { 2025-08-14T21:21:45.7572722Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7573257Z "size": 30448173, 2025-08-14T21:21:45.7573818Z "digest": "sha256:660ffc76f83b006444a5731b215acc2e35138d8be5cac8ed1ffd40f947117495" 2025-08-14T21:21:45.7574381Z }, 2025-08-14T21:21:45.7574630Z { 2025-08-14T21:21:45.7575040Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7575544Z "size": 1554, 2025-08-14T21:21:45.7576049Z "digest": "sha256:d6b6067d541bf983c727d7a41da76081d8c002d25fd4717b4a56fd118207690c" 2025-08-14T21:21:45.7576612Z }, 2025-08-14T21:21:45.7589882Z { 2025-08-14T21:21:45.7590325Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7590849Z "size": 313280567, 2025-08-14T21:21:45.7591394Z "digest": "sha256:3006f113d37efb0e518a59ee0e22ff04496a6ae0875ac1b01ab9343a9259b6d1" 2025-08-14T21:21:45.7592002Z }, 2025-08-14T21:21:45.7592250Z { 2025-08-14T21:21:45.7592680Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7593216Z "size": 791, 2025-08-14T21:21:45.7593785Z "digest": "sha256:4cddbecee0558395e2bac3e7d4c5b3d1675de080906873d4aea58b3903eda71a" 2025-08-14T21:21:45.7594413Z }, 2025-08-14T21:21:45.7594715Z { 2025-08-14T21:21:45.7595144Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7595675Z "size": 106, 2025-08-14T21:21:45.7596206Z "digest": "sha256:72e3bb031d09b83498ff4772e793868f3ae2b1c39bdbc4df3625f97f4539f0b5" 2025-08-14T21:21:45.7596814Z }, 2025-08-14T21:21:45.7597060Z { 2025-08-14T21:21:45.7597492Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7598038Z "size": 704, 2025-08-14T21:21:45.7598560Z "digest": "sha256:3f60d44af11c7a57a450fe66924f89ed879d298561bf0438e1ab39f905ac68c4" 2025-08-14T21:21:45.7599163Z }, 2025-08-14T21:21:45.7599417Z { 2025-08-14T21:21:45.7599836Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7600381Z "size": 1216, 2025-08-14T21:21:45.7600915Z "digest": "sha256:a80f570fd810851d40ee2d13e8e55e37c1aa9067769169a394f8ee78dfb67c82" 2025-08-14T21:21:45.7601513Z }, 2025-08-14T21:21:45.7601765Z { 2025-08-14T21:21:45.7602195Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7602715Z "size": 486, 2025-08-14T21:21:45.7603233Z "digest": "sha256:3b13dc7bd76c403264b97e45268521bff47f04837b1db3b09af99f77d4b5e8e9" 2025-08-14T21:21:45.7603790Z }, 2025-08-14T21:21:45.7603970Z { 2025-08-14T21:21:45.7604275Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7604738Z "size": 110343724, 2025-08-14T21:21:45.7605129Z "digest": "sha256:33e4e5b090ae0cd22ca9920d1b41c8c707983637630cd1d92096c4f0a2724052" 2025-08-14T21:21:45.7605544Z }, 2025-08-14T21:21:45.7605723Z { 2025-08-14T21:21:45.7606027Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7606407Z "size": 4815, 2025-08-14T21:21:45.7606774Z "digest": "sha256:322c959d41689696130170e2cb35b8c6498142bb2dadde0d8e081eda221f8a35" 2025-08-14T21:21:45.7607529Z }, 2025-08-14T21:21:45.7607710Z { 2025-08-14T21:21:45.7608168Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7608550Z "size": 1710, 2025-08-14T21:21:45.7608946Z "digest": "sha256:ec63cf4c8a9576f30b4aa4e056589edec178adb9e1bd4fae7e30db8ae931e79b" 2025-08-14T21:21:45.7609385Z }, 2025-08-14T21:21:45.7609563Z { 2025-08-14T21:21:45.7609869Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7610256Z "size": 724, 2025-08-14T21:21:45.7610627Z "digest": "sha256:403909ca2e26cbc4c01ed56f3d9d7068988093b3f8f4f48660469264b731dc40" 2025-08-14T21:21:45.7611054Z }, 2025-08-14T21:21:45.7611234Z { 2025-08-14T21:21:45.7611543Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7611933Z "size": 544, 2025-08-14T21:21:45.7612313Z "digest": "sha256:bf535c931347a635c90c6d6ad9512e91ac39f2a5b184b132d298a030e68ec10b" 2025-08-14T21:21:45.7612750Z }, 2025-08-14T21:21:45.7612932Z { 2025-08-14T21:21:45.7613253Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7613644Z "size": 3243786593, 2025-08-14T21:21:45.7614035Z "digest": "sha256:545d12c0210f80f01e4caa8578b32f2912fa4ef7ba26c357072baaf2dc5af8aa" 2025-08-14T21:21:45.7614461Z }, 2025-08-14T21:21:45.7614645Z { 2025-08-14T21:21:45.7614944Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7615324Z "size": 32, 2025-08-14T21:21:45.7615708Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7616135Z }, 2025-08-14T21:21:45.7616320Z { 2025-08-14T21:21:45.7616629Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7617003Z "size": 380, 2025-08-14T21:21:45.7617380Z "digest": "sha256:b19f75074c268f9b0e74e8d5d2e023edf682a8a6c87119e3371f8613f4194c58" 2025-08-14T21:21:45.7617803Z }, 2025-08-14T21:21:45.7617988Z { 2025-08-14T21:21:45.7618292Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7618679Z "size": 234125, 2025-08-14T21:21:45.7619067Z "digest": "sha256:f9818da262fd5fa3d67ad004699e6e7903f4c0c7d0e9fb369070fea11e9f1d63" 2025-08-14T21:21:45.7619487Z }, 2025-08-14T21:21:45.7619671Z { 2025-08-14T21:21:45.7619978Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7620351Z "size": 232, 2025-08-14T21:21:45.7620732Z "digest": "sha256:a4c22264daca8b4c9518e8469b6d093f0248269b76b1d8c604f91a3c320a28e3" 2025-08-14T21:21:45.7621155Z }, 2025-08-14T21:21:45.7621330Z { 2025-08-14T21:21:45.7621637Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7622026Z "size": 4483468, 2025-08-14T21:21:45.7622402Z "digest": "sha256:74df148c3c95c323564cb3c89a52a2a2fd3622f3feb5232771ba952ee33977ae" 2025-08-14T21:21:45.7622826Z }, 2025-08-14T21:21:45.7623005Z { 2025-08-14T21:21:45.7623311Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7623692Z "size": 1865, 2025-08-14T21:21:45.7624087Z "digest": "sha256:9064a8aede51afb66b1fc77446786af4ae1664cdee81eda4e4d5f9fee9dac32c" 2025-08-14T21:21:45.7624530Z }, 2025-08-14T21:21:45.7624704Z { 2025-08-14T21:21:45.7625016Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7625399Z "size": 474, 2025-08-14T21:21:45.7625660Z + exit 0 2025-08-14T21:21:45.7626030Z "digest": "sha256:73685becdb22d94ff1dffa640198d931bdba7721cd3403a8a9cd61b3595cef86" 2025-08-14T21:21:45.7626470Z }, 2025-08-14T21:21:45.7626641Z { 2025-08-14T21:21:45.7626948Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7627336Z "size": 179, 2025-08-14T21:21:45.7627715Z "digest": "sha256:b392343e1c1b34a2657cd0b1192ddcb075146835baed5e1435cb757e4a830d89" 2025-08-14T21:21:45.7628135Z }, 2025-08-14T21:21:45.7628314Z { 2025-08-14T21:21:45.7628622Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7629083Z "size": 588, 2025-08-14T21:21:45.7629557Z "digest": "sha256:68e364b192f0fc6db3f2de7c852a0b11a064eb03ebffa295a42f87dc9cf2afb5" 2025-08-14T21:21:45.7629993Z }, 2025-08-14T21:21:45.7630167Z { 2025-08-14T21:21:45.7630483Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7630872Z "size": 7847064498, 2025-08-14T21:21:45.7631272Z "digest": "sha256:12a76fc2e772cef36083b2f6e93b2b69bdb57b9f65db497336dcfffd44845a6e" 2025-08-14T21:21:45.7631712Z }, 2025-08-14T21:21:45.7631895Z { 2025-08-14T21:21:45.7632203Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7632581Z "size": 802, 2025-08-14T21:21:45.7632960Z "digest": "sha256:75d5be9188851ab271def43a9fcf905e0a1f8827ba7e8081b8efbcc048e2dd4c" 2025-08-14T21:21:45.7633400Z }, 2025-08-14T21:21:45.7633572Z { 2025-08-14T21:21:45.7633885Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7634274Z "size": 32687366, 2025-08-14T21:21:45.7634705Z "digest": "sha256:5043d38caf21163d48696c3248233dd5db1a2ec62820bf6c0260e15d7049ad3e" 2025-08-14T21:21:45.7635135Z }, 2025-08-14T21:21:45.7635324Z { 2025-08-14T21:21:45.7635617Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7636007Z "size": 104, 2025-08-14T21:21:45.7636392Z "digest": "sha256:b95eb43b4a658498a7b72605a864dfe6c64367835be5dab3bfe4e32876307312" 2025-08-14T21:21:45.7636824Z }, 2025-08-14T21:21:45.7636999Z { 2025-08-14T21:21:45.7637305Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7637702Z "size": 1494, 2025-08-14T21:21:45.7638095Z "digest": "sha256:ecf009cc06f45d09c07e0c2872de4e94efbfb6e8baea2600b9c1cf88a471012b" 2025-08-14T21:21:45.7638531Z }, 2025-08-14T21:21:45.7638701Z { 2025-08-14T21:21:45.7639000Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7639387Z "size": 453560485, 2025-08-14T21:21:45.7639807Z "digest": "sha256:ec4bce64c4a92a8d8955232324b8ba2c095fe95830c755ea5f9fd0eb4a7cb837" 2025-08-14T21:21:45.7640230Z }, 2025-08-14T21:21:45.7640406Z { 2025-08-14T21:21:45.7640709Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7641097Z "size": 163, 2025-08-14T21:21:45.7641483Z "digest": "sha256:5e0f1f822a8cf7f9f6b711baff5d66b6a4fd2d4ae4ff07396c1087ed0c1118ec" 2025-08-14T21:21:45.7641917Z }, 2025-08-14T21:21:45.7642094Z { 2025-08-14T21:21:45.7642398Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7642779Z "size": 347, 2025-08-14T21:21:45.7643166Z "digest": "sha256:1ec8cdc0e8957a770bd4849b528d5b4808c9b053fe17b460ca0bb0bb33ec5713" 2025-08-14T21:21:45.7643586Z }, 2025-08-14T21:21:45.7643763Z { 2025-08-14T21:21:45.7644069Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7644548Z "size": 32, 2025-08-14T21:21:45.7644981Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7645418Z }, 2025-08-14T21:21:45.7645587Z { 2025-08-14T21:21:45.7645893Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7646268Z "size": 106, 2025-08-14T21:21:45.7646634Z "digest": "sha256:727e61c4e533150ac954bee7ec6a856202e673bec2699612256943080187945b" 2025-08-14T21:21:45.7647339Z }, 2025-08-14T21:21:45.7647529Z { 2025-08-14T21:21:45.7647835Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7648211Z "size": 425, 2025-08-14T21:21:45.7648594Z "digest": "sha256:a93490f9df7fab243bdf00e0ac93fc0939de54b832bb301b56239857416120fe" 2025-08-14T21:21:45.7649024Z }, 2025-08-14T21:21:45.7649195Z { 2025-08-14T21:21:45.7649501Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7649882Z "size": 19308711, 2025-08-14T21:21:45.7650275Z "digest": "sha256:b3f88809dbfe865790cfe44ebd4bc49b8900ce8e45cfabfc91298a87fc4c929b" 2025-08-14T21:21:45.7650847Z }, 2025-08-14T21:21:45.7651138Z { 2025-08-14T21:21:45.7651441Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7651820Z "size": 108, 2025-08-14T21:21:45.7652189Z "digest": "sha256:5590b432104632f37e510dbd8cfdc535239b2d90242b3a466a1e57d80c12a9e0" 2025-08-14T21:21:45.7652613Z }, 2025-08-14T21:21:45.7652784Z { 2025-08-14T21:21:45.7653087Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7653466Z "size": 692, 2025-08-14T21:21:45.7653831Z "digest": "sha256:740bfb2f3383667f227c74c0a618ac0759edaf2a5938793f26655e7c06006174" 2025-08-14T21:21:45.7654250Z }, 2025-08-14T21:21:45.7654445Z { 2025-08-14T21:21:45.7654767Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7655149Z "size": 724, 2025-08-14T21:21:45.7655515Z "digest": "sha256:403909ca2e26cbc4c01ed56f3d9d7068988093b3f8f4f48660469264b731dc40" 2025-08-14T21:21:45.7655933Z }, 2025-08-14T21:21:45.7656110Z { 2025-08-14T21:21:45.7656420Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7656806Z "size": 118, 2025-08-14T21:21:45.7657191Z "digest": "sha256:d65a0285ce4046ac2df3cbe54a73d66f645e234756eb9911ea9dfe6c76c44404" 2025-08-14T21:21:45.7657617Z }, 2025-08-14T21:21:45.7657790Z { 2025-08-14T21:21:45.7658087Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7658464Z "size": 136, 2025-08-14T21:21:45.7658843Z "digest": "sha256:2c6c790a94e289a2fc748b46bdc193d6dbd78e2298b99fda2a75fa0a2337e4df" 2025-08-14T21:21:45.7659261Z }, 2025-08-14T21:21:45.7659437Z { 2025-08-14T21:21:45.7659740Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7660112Z "size": 139, 2025-08-14T21:21:45.7660492Z "digest": "sha256:9d95dba818e54615db10af94ca9f60886564e4aa1e12da89e533f6842ce1d619" 2025-08-14T21:21:45.7660915Z }, 2025-08-14T21:21:45.7661092Z { 2025-08-14T21:21:45.7661398Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7661774Z "size": 32, 2025-08-14T21:21:45.7662156Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7662578Z }, 2025-08-14T21:21:45.7662748Z { 2025-08-14T21:21:45.7663049Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7663423Z "size": 211, 2025-08-14T21:21:45.7663804Z "digest": "sha256:4381ffc02faf2a515f8b713123a2f4ada8b364f2231ccbe13cdadae61308b7cf" 2025-08-14T21:21:45.7664241Z }, 2025-08-14T21:21:45.7664412Z { 2025-08-14T21:21:45.7664714Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7665093Z "size": 272997052, 2025-08-14T21:21:45.7665486Z "digest": "sha256:157c17059e69d86f00abf09db154a6f7d1711e5c69bac623cde6889edf9e3f55" 2025-08-14T21:21:45.7665916Z }, 2025-08-14T21:21:45.7666097Z { 2025-08-14T21:21:45.7666402Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7666795Z "size": 3085176703, 2025-08-14T21:21:45.7667192Z "digest": "sha256:5a57159127754e141ede2b31c509e96b590152c5a531c67d8519edfc4d7e9470" 2025-08-14T21:21:45.7667612Z }, 2025-08-14T21:21:45.7667786Z { 2025-08-14T21:21:45.7668091Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7668477Z "size": 129, 2025-08-14T21:21:45.7668858Z "digest": "sha256:3d4b192e6130aab73a9b1fabb8015a44c8e58ad7c107414701113ac3b34b36c2" 2025-08-14T21:21:45.7669296Z }, 2025-08-14T21:21:45.7669478Z { 2025-08-14T21:21:45.7669780Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7670161Z "size": 776, 2025-08-14T21:21:45.7670540Z "digest": "sha256:1aef221cbe7de18f780f9a5f4418bc276176bc252900038fd21d808da03746fd" 2025-08-14T21:21:45.7670960Z }, 2025-08-14T21:21:45.7671141Z { 2025-08-14T21:21:45.7671453Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7671923Z "size": 724, 2025-08-14T21:21:45.7672398Z "digest": "sha256:403909ca2e26cbc4c01ed56f3d9d7068988093b3f8f4f48660469264b731dc40" 2025-08-14T21:21:45.7672824Z }, 2025-08-14T21:21:45.7673009Z { 2025-08-14T21:21:45.7673306Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7673690Z "size": 140, 2025-08-14T21:21:45.7674065Z "digest": "sha256:6c6566b1f69c877378ed64382ce010f3c25393e483e537a87a3a68f99dfff735" 2025-08-14T21:21:45.7674500Z }, 2025-08-14T21:21:45.7674703Z { 2025-08-14T21:21:45.7675015Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7675386Z "size": 32, 2025-08-14T21:21:45.7675766Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7676201Z }, 2025-08-14T21:21:45.7676373Z { 2025-08-14T21:21:45.7676690Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7677083Z "size": 160, 2025-08-14T21:21:45.7677466Z "digest": "sha256:5629b6e108ff5911bd75613d87aa2b0e67c38742c7b024a7c82c2a20727b2285" 2025-08-14T21:21:45.7677875Z }, 2025-08-14T21:21:45.7678052Z { 2025-08-14T21:21:45.7678358Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7678729Z "size": 1011, 2025-08-14T21:21:45.7679113Z "digest": "sha256:3d11a12afe792395a89d4ea7af3a42cd11b9acc8601bf41a3b212f7701a9052f" 2025-08-14T21:21:45.7679536Z }, 2025-08-14T21:21:45.7679705Z { 2025-08-14T21:21:45.7680007Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7680380Z "size": 724, 2025-08-14T21:21:45.7680743Z "digest": "sha256:403909ca2e26cbc4c01ed56f3d9d7068988093b3f8f4f48660469264b731dc40" 2025-08-14T21:21:45.7681160Z }, 2025-08-14T21:21:45.7681336Z { 2025-08-14T21:21:45.7681631Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7682004Z "size": 135, 2025-08-14T21:21:45.7682396Z "digest": "sha256:7f43e20df7d5a1fb426a75050e9b9443267fd816b5fe29e48c6ded748431ecce" 2025-08-14T21:21:45.7682817Z }, 2025-08-14T21:21:45.7682986Z { 2025-08-14T21:21:45.7683288Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7683667Z "size": 32, 2025-08-14T21:21:45.7684039Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7684530Z }, 2025-08-14T21:21:45.7684734Z { 2025-08-14T21:21:45.7685060Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7685440Z "size": 159, 2025-08-14T21:21:45.7685821Z "digest": "sha256:1f1ad77e7661a1f0881bc680e6793feccfb50f8c57b805b67e26b03584f4a420" 2025-08-14T21:21:45.7686242Z }, 2025-08-14T21:21:45.7686421Z { 2025-08-14T21:21:45.7686730Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7687111Z "size": 1370, 2025-08-14T21:21:45.7687490Z "digest": "sha256:2b28f013a7e0cb7b2880907b423beaf9f1aa56a98075c5cd3b67c1304d101107" 2025-08-14T21:21:45.7687924Z }, 2025-08-14T21:21:45.7688110Z { 2025-08-14T21:21:45.7688408Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7688792Z "size": 32, 2025-08-14T21:21:45.7689172Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7689593Z }, 2025-08-14T21:21:45.7689771Z { 2025-08-14T21:21:45.7690076Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7690448Z "size": 137, 2025-08-14T21:21:45.7690834Z "digest": "sha256:f2553a813bfdf963aa412a268cae6ecec1c041d0d34b19485f3771177454a45d" 2025-08-14T21:21:45.7691266Z }, 2025-08-14T21:21:45.7691442Z { 2025-08-14T21:21:45.7691753Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7692132Z "size": 380, 2025-08-14T21:21:45.7692522Z "digest": "sha256:920afab8d80cf9d21d7440a3cc2ff53dd6edec7dcd6468ec06000976a0a8c050" 2025-08-14T21:21:45.7693043Z }, 2025-08-14T21:21:45.7693225Z { 2025-08-14T21:21:45.7693612Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7693990Z "size": 32, 2025-08-14T21:21:45.7694378Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7694815Z }, 2025-08-14T21:21:45.7694990Z { 2025-08-14T21:21:45.7695299Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7695680Z "size": 104, 2025-08-14T21:21:45.7696055Z "digest": "sha256:49bf96108822afb250e2eb64eec419d15a30c113424e9cd32e1165510c3ecbe8" 2025-08-14T21:21:45.7696481Z }, 2025-08-14T21:21:45.7696661Z { 2025-08-14T21:21:45.7696968Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7697348Z "size": 408, 2025-08-14T21:21:45.7697723Z "digest": "sha256:41a88d2df1710ac9563f2c31e12f1f5055e3fb948fb23be94056a92f9bcd23cc" 2025-08-14T21:21:45.7698149Z }, 2025-08-14T21:21:45.7698339Z { 2025-08-14T21:21:45.7698648Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7699031Z "size": 32, 2025-08-14T21:21:45.7699413Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7699842Z }, 2025-08-14T21:21:45.7700027Z { 2025-08-14T21:21:45.7700336Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7700720Z "size": 109, 2025-08-14T21:21:45.7701110Z "digest": "sha256:ac86414debcc5d65ada0c2f4cac431d92fee6674eaae5588900bbeccba717532" 2025-08-14T21:21:45.7701554Z }, 2025-08-14T21:21:45.7701735Z { 2025-08-14T21:21:45.7702038Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7702419Z "size": 1896, 2025-08-14T21:21:45.7702808Z "digest": "sha256:41e0d573c7e6286174db6b451adbb436dbff25b2f486fb73933d501b36a772c8" 2025-08-14T21:21:45.7703230Z }, 2025-08-14T21:21:45.7703413Z { 2025-08-14T21:21:45.7703730Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7704113Z "size": 242635943, 2025-08-14T21:21:45.7704519Z "digest": "sha256:cd44cde965e2ab43b7122b93bdd7c8d2fff67a48fb5740dd56683144f9ccaf61" 2025-08-14T21:21:45.7704953Z }, 2025-08-14T21:21:45.7705132Z { 2025-08-14T21:21:45.7705446Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7705833Z "size": 106, 2025-08-14T21:21:45.7706216Z "digest": "sha256:53fa29ec5ecf82fea214063654688a04cf6152402983464e63e925cdbeadfe74" 2025-08-14T21:21:45.7706637Z }, 2025-08-14T21:21:45.7706823Z { 2025-08-14T21:21:45.7707137Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7707515Z "size": 165, 2025-08-14T21:21:45.7707898Z "digest": "sha256:672e0afb1fbc6078be7530e68ff59643e7a503935a9a45e11e1a7fdbe557cd87" 2025-08-14T21:21:45.7708337Z }, 2025-08-14T21:21:45.7708515Z { 2025-08-14T21:21:45.7708835Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7709226Z "size": 7943, 2025-08-14T21:21:45.7709614Z "digest": "sha256:2fb460415c5c60fabaa667070a9d859ca238ef2cb49b724c6c8aac3fdfa9525b" 2025-08-14T21:21:45.7710053Z }, 2025-08-14T21:21:45.7710236Z { 2025-08-14T21:21:45.7710539Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7710919Z "size": 8071, 2025-08-14T21:21:45.7711294Z "digest": "sha256:987f67564697f5257d3ead5b348038481241b157c9cb87f79406c00818df165c" 2025-08-14T21:21:45.7711710Z }, 2025-08-14T21:21:45.7711886Z { 2025-08-14T21:21:45.7712199Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7712579Z "size": 303, 2025-08-14T21:21:45.7712969Z "digest": "sha256:ac6d9eb82fcfb8dd8be4e3054a3f85c7a60eb5a9087f68234ffb8fd6fd0d8cbe" 2025-08-14T21:21:45.7713413Z }, 2025-08-14T21:21:45.7713592Z { 2025-08-14T21:21:45.7713900Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7714382Z "size": 13360626, 2025-08-14T21:21:45.7714869Z "digest": "sha256:92b42b4592d640915ccc8ef9660c1b4bc0d0bbaa751ade2e5bd966e3a13441cd" 2025-08-14T21:21:45.7715302Z }, 2025-08-14T21:21:45.7715484Z { 2025-08-14T21:21:45.7715801Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7716188Z "size": 108, 2025-08-14T21:21:45.7716576Z "digest": "sha256:d87965a1f59773c0ad96d6355acc359bc012fd0d4b6da6f32acdbabf558a947e" 2025-08-14T21:21:45.7717006Z }, 2025-08-14T21:21:45.7717194Z { 2025-08-14T21:21:45.7717495Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7717879Z "size": 54145663, 2025-08-14T21:21:45.7718268Z "digest": "sha256:bf708065b04653ac96bc80b81fd80c77f7d7c804f2671f24502b4b201af869a3" 2025-08-14T21:21:45.7718683Z }, 2025-08-14T21:21:45.7718872Z { 2025-08-14T21:21:45.7719182Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-08-14T21:21:45.7719557Z "size": 32, 2025-08-14T21:21:45.7719966Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-08-14T21:21:45.7720405Z } 2025-08-14T21:21:45.7720585Z ] 2025-08-14T21:21:45.7720777Z } 2025-08-14T21:21:45.7766284Z ##[group]Run set -eux 2025-08-14T21:21:45.7766544Z set -eux 2025-08-14T21:21:45.7767336Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin 2025-08-14T21:21:45.7778696Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:45.7779033Z env: 2025-08-14T21:21:45.7779239Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:45.7779488Z ##[endgroup] 2025-08-14T21:21:45.7813531Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-08-14T21:21:45.7814603Z + jq --raw-output .SecretString 2025-08-14T21:21:45.7815786Z + jq -r .docker_hub_readonly_token 2025-08-14T21:21:45.7816964Z + docker login --username pytorchbot --password-stdin 2025-08-14T21:21:46.3538781Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-08-14T21:21:46.3539347Z Configure a credential helper to remove this warning. See 2025-08-14T21:21:46.3539896Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-08-14T21:21:46.3540258Z 2025-08-14T21:21:46.3540357Z Login Succeeded 2025-08-14T21:21:46.3667084Z ##[group]Run tag=${ECR_DOCKER_IMAGE##*:} 2025-08-14T21:21:46.3667445Z tag=${ECR_DOCKER_IMAGE##*:} 2025-08-14T21:21:46.3667856Z echo "docker pull ghcr.io/pytorch/ci-image:${tag/:/-}" 2025-08-14T21:21:46.3676888Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:46.3677241Z env: 2025-08-14T21:21:46.3677450Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:46.3678220Z ECR_DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:46.3679023Z ##[endgroup] 2025-08-14T21:21:46.3710971Z docker pull ghcr.io/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:46.3766577Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-08-14T21:21:46.3766981Z with: 2025-08-14T21:21:46.3767689Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:46.3768556Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:46.3768913Z env: 2025-08-14T21:21:46.3769115Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:46.3769365Z ##[endgroup] 2025-08-14T21:21:46.3793253Z ##[group]Run set -x 2025-08-14T21:21:46.3793513Z set -x 2025-08-14T21:21:46.3793727Z set +e 2025-08-14T21:21:46.3793934Z  2025-08-14T21:21:46.3794127Z login() { 2025-08-14T21:21:46.3794782Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-08-14T21:21:46.3795256Z } 2025-08-14T21:21:46.3795447Z  2025-08-14T21:21:46.3795674Z retry () { 2025-08-14T21:21:46.3795931Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-08-14T21:21:46.3796224Z } 2025-08-14T21:21:46.3796421Z  2025-08-14T21:21:46.3796643Z retry login "${DOCKER_REGISTRY}" 2025-08-14T21:21:46.3796921Z  2025-08-14T21:21:46.3797365Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-08-14T21:21:46.3797962Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-08-14T21:21:46.3798298Z  2025-08-14T21:21:46.3798495Z set -e 2025-08-14T21:21:46.3798815Z # ignore output since only exit code is used for conditional 2025-08-14T21:21:46.3799265Z # only pull docker image if it's not available locally 2025-08-14T21:21:46.3799764Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-08-14T21:21:46.3800227Z  retry docker pull "${DOCKER_IMAGE}" 2025-08-14T21:21:46.3800524Z fi 2025-08-14T21:21:46.3809733Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:21:46.3810086Z env: 2025-08-14T21:21:46.3810290Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:21:46.3811038Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:46.3811908Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:46.3812265Z ##[endgroup] 2025-08-14T21:21:46.3842717Z + set +e 2025-08-14T21:21:46.3843312Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:46.3843737Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:46.3847474Z + aws ecr get-login-password --region us-east-1 2025-08-14T21:21:46.3848411Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-08-14T21:21:46.9106443Z WARNING! Your password will be stored unencrypted in /home/ec2-user/.docker/config.json. 2025-08-14T21:21:46.9107161Z Configure a credential helper to remove this warning. See 2025-08-14T21:21:46.9107712Z https://docs.docker.com/engine/reference/commandline/login/#credentials-store 2025-08-14T21:21:46.9108084Z 2025-08-14T21:21:46.9108791Z Login Succeeded 2025-08-14T21:21:46.9144402Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:46.9145288Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-08-14T21:21:47.1742041Z + IMAGE_SIZE=14995.185963630676 2025-08-14T21:21:47.1742397Z + echo 'Compressed size of image in MB: 14995.185963630676' 2025-08-14T21:21:47.1742769Z + set -e 2025-08-14T21:21:47.1743027Z Compressed size of image in MB: 14995.185963630676 2025-08-14T21:21:47.1744167Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:47.1907716Z + retry docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:47.1909138Z + docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:21:47.4408655Z pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe: Pulling from pytorch/ci-image 2025-08-14T21:21:47.4409646Z 660ffc76f83b: Pulling fs layer 2025-08-14T21:21:47.4410038Z d6b6067d541b: Pulling fs layer 2025-08-14T21:21:47.4410420Z 3006f113d37e: Pulling fs layer 2025-08-14T21:21:47.4411861Z 4cddbecee055: Pulling fs layer 2025-08-14T21:21:47.4412137Z 72e3bb031d09: Pulling fs layer 2025-08-14T21:21:47.4412406Z 3f60d44af11c: Pulling fs layer 2025-08-14T21:21:47.4412782Z a80f570fd810: Pulling fs layer 2025-08-14T21:21:47.4413061Z 3b13dc7bd76c: Pulling fs layer 2025-08-14T21:21:47.4413321Z 33e4e5b090ae: Pulling fs layer 2025-08-14T21:21:47.4413669Z 322c959d4168: Pulling fs layer 2025-08-14T21:21:47.4413964Z ec63cf4c8a95: Pulling fs layer 2025-08-14T21:21:47.4414220Z 403909ca2e26: Pulling fs layer 2025-08-14T21:21:47.4414481Z bf535c931347: Pulling fs layer 2025-08-14T21:21:47.4414744Z 545d12c0210f: Pulling fs layer 2025-08-14T21:21:47.4415000Z 4f4fb700ef54: Pulling fs layer 2025-08-14T21:21:47.4415264Z b19f75074c26: Pulling fs layer 2025-08-14T21:21:47.4415615Z f9818da262fd: Pulling fs layer 2025-08-14T21:21:47.4415882Z a4c22264daca: Pulling fs layer 2025-08-14T21:21:47.4416145Z 74df148c3c95: Pulling fs layer 2025-08-14T21:21:47.4416410Z 9064a8aede51: Pulling fs layer 2025-08-14T21:21:47.4416676Z 73685becdb22: Pulling fs layer 2025-08-14T21:21:47.4416958Z b392343e1c1b: Pulling fs layer 2025-08-14T21:21:47.4417329Z 68e364b192f0: Pulling fs layer 2025-08-14T21:21:47.4417719Z 12a76fc2e772: Pulling fs layer 2025-08-14T21:21:47.4418075Z 75d5be918885: Pulling fs layer 2025-08-14T21:21:47.4418460Z 5043d38caf21: Pulling fs layer 2025-08-14T21:21:47.4418723Z b95eb43b4a65: Pulling fs layer 2025-08-14T21:21:47.4418987Z ecf009cc06f4: Pulling fs layer 2025-08-14T21:21:47.4419335Z ec4bce64c4a9: Pulling fs layer 2025-08-14T21:21:47.4419673Z 5e0f1f822a8c: Pulling fs layer 2025-08-14T21:21:47.4420043Z 1ec8cdc0e895: Pulling fs layer 2025-08-14T21:21:47.4420301Z 3f60d44af11c: Waiting 2025-08-14T21:21:47.4420541Z 727e61c4e533: Pulling fs layer 2025-08-14T21:21:47.4420794Z a93490f9df7f: Pulling fs layer 2025-08-14T21:21:47.4421056Z b3f88809dbfe: Pulling fs layer 2025-08-14T21:21:47.4421317Z 5590b4321046: Pulling fs layer 2025-08-14T21:21:47.4421578Z 740bfb2f3383: Pulling fs layer 2025-08-14T21:21:47.4421837Z d65a0285ce40: Pulling fs layer 2025-08-14T21:21:47.4422155Z 2c6c790a94e2: Pulling fs layer 2025-08-14T21:21:47.4422448Z 9d95dba818e5: Pulling fs layer 2025-08-14T21:21:47.4422703Z 4381ffc02faf: Pulling fs layer 2025-08-14T21:21:47.4422967Z 157c17059e69: Pulling fs layer 2025-08-14T21:21:47.4423234Z 5a5715912775: Pulling fs layer 2025-08-14T21:21:47.4423484Z 3d4b192e6130: Pulling fs layer 2025-08-14T21:21:47.4423742Z 1aef221cbe7d: Pulling fs layer 2025-08-14T21:21:47.4424001Z 3b13dc7bd76c: Waiting 2025-08-14T21:21:47.4424233Z 6c6566b1f69c: Pulling fs layer 2025-08-14T21:21:47.4424479Z 322c959d4168: Waiting 2025-08-14T21:21:47.4424754Z 5629b6e108ff: Pulling fs layer 2025-08-14T21:21:47.4425014Z 3d11a12afe79: Pulling fs layer 2025-08-14T21:21:47.4425265Z 7f43e20df7d5: Pulling fs layer 2025-08-14T21:21:47.4425603Z 1f1ad77e7661: Pulling fs layer 2025-08-14T21:21:47.4425884Z 2b28f013a7e0: Pulling fs layer 2025-08-14T21:21:47.4426137Z f2553a813bfd: Pulling fs layer 2025-08-14T21:21:47.4426398Z 920afab8d80c: Pulling fs layer 2025-08-14T21:21:47.4426662Z 49bf96108822: Pulling fs layer 2025-08-14T21:21:47.4426921Z 41a88d2df171: Pulling fs layer 2025-08-14T21:21:47.4427175Z ac86414debcc: Pulling fs layer 2025-08-14T21:21:47.4427646Z 41e0d573c7e6: Pulling fs layer 2025-08-14T21:21:47.4427923Z cd44cde965e2: Pulling fs layer 2025-08-14T21:21:47.4428181Z 53fa29ec5ecf: Pulling fs layer 2025-08-14T21:21:47.4428450Z 672e0afb1fbc: Pulling fs layer 2025-08-14T21:21:47.4428713Z 2fb460415c5c: Pulling fs layer 2025-08-14T21:21:47.4429032Z 987f67564697: Pulling fs layer 2025-08-14T21:21:47.4429412Z ac6d9eb82fcf: Pulling fs layer 2025-08-14T21:21:47.4429722Z 92b42b4592d6: Pulling fs layer 2025-08-14T21:21:47.4429975Z d87965a1f597: Pulling fs layer 2025-08-14T21:21:47.4430236Z bf708065b046: Pulling fs layer 2025-08-14T21:21:47.4430484Z 7f43e20df7d5: Waiting 2025-08-14T21:21:47.4430706Z 72e3bb031d09: Waiting 2025-08-14T21:21:47.4430934Z 1f1ad77e7661: Waiting 2025-08-14T21:21:47.4431165Z 2b28f013a7e0: Waiting 2025-08-14T21:21:47.4431604Z f2553a813bfd: Waiting 2025-08-14T21:21:47.4431926Z ec63cf4c8a95: Waiting 2025-08-14T21:21:47.4432251Z 4cddbecee055: Waiting 2025-08-14T21:21:47.4432563Z 403909ca2e26: Waiting 2025-08-14T21:21:47.4432892Z 920afab8d80c: Waiting 2025-08-14T21:21:47.4433259Z 9064a8aede51: Waiting 2025-08-14T21:21:47.4433622Z b392343e1c1b: Waiting 2025-08-14T21:21:47.4433937Z a80f570fd810: Waiting 2025-08-14T21:21:47.4434253Z 33e4e5b090ae: Waiting 2025-08-14T21:21:47.4434521Z d65a0285ce40: Waiting 2025-08-14T21:21:47.4434735Z 73685becdb22: Waiting 2025-08-14T21:21:47.4434967Z 4381ffc02faf: Waiting 2025-08-14T21:21:47.4435232Z b19f75074c26: Waiting 2025-08-14T21:21:47.4435450Z 157c17059e69: Waiting 2025-08-14T21:21:47.4435662Z 5a5715912775: Waiting 2025-08-14T21:21:47.4435878Z 4f4fb700ef54: Waiting 2025-08-14T21:21:47.4436093Z f9818da262fd: Waiting 2025-08-14T21:21:47.4436315Z ecf009cc06f4: Waiting 2025-08-14T21:21:47.4436534Z 41a88d2df171: Waiting 2025-08-14T21:21:47.4436799Z a4c22264daca: Waiting 2025-08-14T21:21:47.4437102Z 3d4b192e6130: Waiting 2025-08-14T21:21:47.4437386Z 74df148c3c95: Waiting 2025-08-14T21:21:47.4437630Z 68e364b192f0: Waiting 2025-08-14T21:21:47.4437849Z 9d95dba818e5: Waiting 2025-08-14T21:21:47.4438072Z b95eb43b4a65: Waiting 2025-08-14T21:21:47.4438294Z bf535c931347: Waiting 2025-08-14T21:21:47.4438522Z 545d12c0210f: Waiting 2025-08-14T21:21:47.4438741Z 41e0d573c7e6: Waiting 2025-08-14T21:21:47.4438954Z 5590b4321046: Waiting 2025-08-14T21:21:47.4439174Z 3d11a12afe79: Waiting 2025-08-14T21:21:47.4439394Z cd44cde965e2: Waiting 2025-08-14T21:21:47.4439617Z 987f67564697: Waiting 2025-08-14T21:21:47.4439827Z 5629b6e108ff: Waiting 2025-08-14T21:21:47.4440048Z 2fb460415c5c: Waiting 2025-08-14T21:21:47.4440274Z 2c6c790a94e2: Waiting 2025-08-14T21:21:47.4440488Z a93490f9df7f: Waiting 2025-08-14T21:21:47.4440712Z 5e0f1f822a8c: Waiting 2025-08-14T21:21:47.4440939Z bf708065b046: Waiting 2025-08-14T21:21:47.4441160Z b3f88809dbfe: Waiting 2025-08-14T21:21:47.4441388Z 672e0afb1fbc: Waiting 2025-08-14T21:21:47.4441614Z 92b42b4592d6: Waiting 2025-08-14T21:21:47.4441830Z 5043d38caf21: Waiting 2025-08-14T21:21:47.4442068Z ac6d9eb82fcf: Waiting 2025-08-14T21:21:47.4442375Z d87965a1f597: Waiting 2025-08-14T21:21:47.4442596Z 75d5be918885: Waiting 2025-08-14T21:21:47.4442916Z ac86414debcc: Waiting 2025-08-14T21:21:47.4443237Z 6c6566b1f69c: Waiting 2025-08-14T21:21:47.4443549Z 49bf96108822: Waiting 2025-08-14T21:21:47.4443874Z 727e61c4e533: Waiting 2025-08-14T21:21:47.4444240Z ec4bce64c4a9: Waiting 2025-08-14T21:21:47.4444638Z 740bfb2f3383: Waiting 2025-08-14T21:21:47.4444875Z 1aef221cbe7d: Waiting 2025-08-14T21:21:47.4445133Z 12a76fc2e772: Waiting 2025-08-14T21:21:47.4445380Z 1ec8cdc0e895: Waiting 2025-08-14T21:21:47.5271026Z d6b6067d541b: Verifying Checksum 2025-08-14T21:21:47.5271357Z d6b6067d541b: Download complete 2025-08-14T21:21:47.6148359Z 4cddbecee055: Download complete 2025-08-14T21:21:47.7140244Z 72e3bb031d09: Download complete 2025-08-14T21:21:47.7905050Z 660ffc76f83b: Download complete 2025-08-14T21:21:47.7926797Z 3f60d44af11c: Download complete 2025-08-14T21:21:47.8609520Z a80f570fd810: Download complete 2025-08-14T21:21:47.8782448Z 3b13dc7bd76c: Verifying Checksum 2025-08-14T21:21:47.8782854Z 3b13dc7bd76c: Download complete 2025-08-14T21:21:47.9622702Z 322c959d4168: Download complete 2025-08-14T21:21:48.0399568Z ec63cf4c8a95: Verifying Checksum 2025-08-14T21:21:48.0400038Z ec63cf4c8a95: Download complete 2025-08-14T21:21:48.1199703Z 403909ca2e26: Verifying Checksum 2025-08-14T21:21:48.1200035Z 403909ca2e26: Download complete 2025-08-14T21:21:48.2415677Z bf535c931347: Verifying Checksum 2025-08-14T21:21:48.2415998Z bf535c931347: Download complete 2025-08-14T21:21:48.9286725Z 660ffc76f83b: Pull complete 2025-08-14T21:21:48.9514697Z d6b6067d541b: Pull complete 2025-08-14T21:21:49.0345032Z 33e4e5b090ae: Verifying Checksum 2025-08-14T21:21:49.0345352Z 33e4e5b090ae: Download complete 2025-08-14T21:21:49.0425785Z 4f4fb700ef54: Verifying Checksum 2025-08-14T21:21:49.0426087Z 4f4fb700ef54: Download complete 2025-08-14T21:21:49.1178342Z b19f75074c26: Download complete 2025-08-14T21:21:49.1957225Z f9818da262fd: Download complete 2025-08-14T21:21:49.2835647Z a4c22264daca: Verifying Checksum 2025-08-14T21:21:49.2836055Z a4c22264daca: Download complete 2025-08-14T21:21:49.3850342Z 74df148c3c95: Verifying Checksum 2025-08-14T21:21:49.3850735Z 74df148c3c95: Download complete 2025-08-14T21:21:49.4652046Z 9064a8aede51: Verifying Checksum 2025-08-14T21:21:49.4652437Z 9064a8aede51: Download complete 2025-08-14T21:21:49.6044187Z 73685becdb22: Download complete 2025-08-14T21:21:49.6937503Z b392343e1c1b: Download complete 2025-08-14T21:21:49.7775443Z 68e364b192f0: Verifying Checksum 2025-08-14T21:21:49.7775783Z 68e364b192f0: Download complete 2025-08-14T21:21:50.6371809Z 3006f113d37e: Verifying Checksum 2025-08-14T21:21:50.6372137Z 3006f113d37e: Download complete 2025-08-14T21:21:50.7060681Z 75d5be918885: Verifying Checksum 2025-08-14T21:21:50.7060994Z 75d5be918885: Download complete 2025-08-14T21:21:51.0921820Z 5043d38caf21: Verifying Checksum 2025-08-14T21:21:51.0922200Z 5043d38caf21: Download complete 2025-08-14T21:21:51.1704331Z b95eb43b4a65: Verifying Checksum 2025-08-14T21:21:51.1704651Z b95eb43b4a65: Download complete 2025-08-14T21:21:51.2390774Z ecf009cc06f4: Verifying Checksum 2025-08-14T21:21:51.2391083Z ecf009cc06f4: Download complete 2025-08-14T21:21:55.8461594Z ec4bce64c4a9: Verifying Checksum 2025-08-14T21:21:55.8461955Z ec4bce64c4a9: Download complete 2025-08-14T21:21:55.9504400Z 5e0f1f822a8c: Download complete 2025-08-14T21:21:56.0333269Z 1ec8cdc0e895: Verifying Checksum 2025-08-14T21:21:56.0336198Z 1ec8cdc0e895: Download complete 2025-08-14T21:21:56.1117140Z 727e61c4e533: Download complete 2025-08-14T21:21:56.1832357Z a93490f9df7f: Download complete 2025-08-14T21:21:56.4497058Z b3f88809dbfe: Verifying Checksum 2025-08-14T21:21:56.4497488Z b3f88809dbfe: Download complete 2025-08-14T21:21:56.5405481Z 5590b4321046: Verifying Checksum 2025-08-14T21:21:56.5405964Z 5590b4321046: Download complete 2025-08-14T21:21:56.6326621Z 740bfb2f3383: Download complete 2025-08-14T21:21:56.7297889Z d65a0285ce40: Verifying Checksum 2025-08-14T21:21:56.7298244Z d65a0285ce40: Download complete 2025-08-14T21:21:56.8092917Z 2c6c790a94e2: Download complete 2025-08-14T21:21:56.9170578Z 9d95dba818e5: Verifying Checksum 2025-08-14T21:21:56.9170936Z 9d95dba818e5: Download complete 2025-08-14T21:21:57.0076027Z 4381ffc02faf: Verifying Checksum 2025-08-14T21:21:57.0076440Z 4381ffc02faf: Download complete 2025-08-14T21:21:59.7861852Z 157c17059e69: Verifying Checksum 2025-08-14T21:21:59.7862220Z 157c17059e69: Download complete 2025-08-14T21:21:59.9920414Z 3006f113d37e: Pull complete 2025-08-14T21:22:00.1702887Z 4cddbecee055: Pull complete 2025-08-14T21:22:00.3223215Z 72e3bb031d09: Pull complete 2025-08-14T21:22:00.4360737Z 3f60d44af11c: Pull complete 2025-08-14T21:22:00.5187708Z a80f570fd810: Pull complete 2025-08-14T21:22:00.6955374Z 3b13dc7bd76c: Pull complete 2025-08-14T21:22:03.2971444Z 33e4e5b090ae: Pull complete 2025-08-14T21:22:03.4636887Z 322c959d4168: Pull complete 2025-08-14T21:22:03.6058280Z ec63cf4c8a95: Pull complete 2025-08-14T21:22:03.7419452Z 403909ca2e26: Pull complete 2025-08-14T21:22:03.8850377Z bf535c931347: Pull complete 2025-08-14T21:22:21.1285682Z 545d12c0210f: Verifying Checksum 2025-08-14T21:22:21.1288154Z 545d12c0210f: Download complete 2025-08-14T21:22:21.2019657Z 3d4b192e6130: Verifying Checksum 2025-08-14T21:22:21.2019970Z 3d4b192e6130: Download complete 2025-08-14T21:22:21.2836448Z 1aef221cbe7d: Verifying Checksum 2025-08-14T21:22:21.2836935Z 1aef221cbe7d: Download complete 2025-08-14T21:22:21.3668029Z 6c6566b1f69c: Verifying Checksum 2025-08-14T21:22:21.3668440Z 6c6566b1f69c: Download complete 2025-08-14T21:22:21.4717589Z 5629b6e108ff: Verifying Checksum 2025-08-14T21:22:21.4718075Z 5629b6e108ff: Download complete 2025-08-14T21:22:21.5662891Z 3d11a12afe79: Verifying Checksum 2025-08-14T21:22:21.5663243Z 3d11a12afe79: Download complete 2025-08-14T21:22:21.6625415Z 7f43e20df7d5: Verifying Checksum 2025-08-14T21:22:21.6625747Z 7f43e20df7d5: Download complete 2025-08-14T21:22:21.7501686Z 1f1ad77e7661: Verifying Checksum 2025-08-14T21:22:21.7502033Z 1f1ad77e7661: Download complete 2025-08-14T21:22:21.8362132Z 2b28f013a7e0: Verifying Checksum 2025-08-14T21:22:21.8362504Z 2b28f013a7e0: Download complete 2025-08-14T21:22:21.9190093Z f2553a813bfd: Verifying Checksum 2025-08-14T21:22:21.9190455Z f2553a813bfd: Download complete 2025-08-14T21:22:22.0163753Z 920afab8d80c: Verifying Checksum 2025-08-14T21:22:22.0164175Z 920afab8d80c: Download complete 2025-08-14T21:22:22.1082181Z 49bf96108822: Download complete 2025-08-14T21:22:22.2194224Z 41a88d2df171: Verifying Checksum 2025-08-14T21:22:22.2194627Z 41a88d2df171: Download complete 2025-08-14T21:22:22.2936729Z ac86414debcc: Download complete 2025-08-14T21:22:22.3907666Z 41e0d573c7e6: Verifying Checksum 2025-08-14T21:22:22.3908000Z 41e0d573c7e6: Download complete 2025-08-14T21:22:24.9153335Z cd44cde965e2: Verifying Checksum 2025-08-14T21:22:24.9153808Z cd44cde965e2: Download complete 2025-08-14T21:22:25.1694236Z 53fa29ec5ecf: Download complete 2025-08-14T21:22:25.2478690Z 672e0afb1fbc: Verifying Checksum 2025-08-14T21:22:25.2479209Z 672e0afb1fbc: Download complete 2025-08-14T21:22:25.3520708Z 2fb460415c5c: Verifying Checksum 2025-08-14T21:22:25.3521216Z 2fb460415c5c: Download complete 2025-08-14T21:22:25.4306897Z 987f67564697: Verifying Checksum 2025-08-14T21:22:25.4307462Z 987f67564697: Download complete 2025-08-14T21:22:25.5206492Z ac6d9eb82fcf: Verifying Checksum 2025-08-14T21:22:25.5207081Z ac6d9eb82fcf: Download complete 2025-08-14T21:22:25.7597886Z 92b42b4592d6: Verifying Checksum 2025-08-14T21:22:25.7598363Z 92b42b4592d6: Download complete 2025-08-14T21:22:25.8472832Z d87965a1f597: Download complete 2025-08-14T21:22:26.4330140Z bf708065b046: Verifying Checksum 2025-08-14T21:22:26.4330476Z bf708065b046: Download complete 2025-08-14T21:22:30.9498819Z 5a5715912775: Verifying Checksum 2025-08-14T21:22:30.9499153Z 5a5715912775: Download complete 2025-08-14T21:23:08.7448233Z 12a76fc2e772: Download complete 2025-08-14T21:23:40.2867037Z 545d12c0210f: Pull complete 2025-08-14T21:23:40.4954092Z 4f4fb700ef54: Pull complete 2025-08-14T21:23:40.7133646Z b19f75074c26: Pull complete 2025-08-14T21:23:40.8909671Z f9818da262fd: Pull complete 2025-08-14T21:23:40.9908655Z a4c22264daca: Pull complete 2025-08-14T21:23:41.2419947Z 74df148c3c95: Pull complete 2025-08-14T21:23:41.4349090Z 9064a8aede51: Pull complete 2025-08-14T21:23:41.6550900Z 73685becdb22: Pull complete 2025-08-14T21:23:41.8270883Z b392343e1c1b: Pull complete 2025-08-14T21:23:42.0488478Z 68e364b192f0: Pull complete 2025-08-14T21:25:14.9903402Z 12a76fc2e772: Pull complete 2025-08-14T21:25:15.2148345Z 75d5be918885: Pull complete 2025-08-14T21:25:15.9056832Z 5043d38caf21: Pull complete 2025-08-14T21:25:16.1267433Z b95eb43b4a65: Pull complete 2025-08-14T21:25:16.3434884Z ecf009cc06f4: Pull complete 2025-08-14T21:25:25.1346950Z ec4bce64c4a9: Pull complete 2025-08-14T21:25:25.3596170Z 5e0f1f822a8c: Pull complete 2025-08-14T21:25:25.5797038Z 1ec8cdc0e895: Pull complete 2025-08-14T21:25:26.0302033Z 727e61c4e533: Pull complete 2025-08-14T21:25:26.2411723Z a93490f9df7f: Pull complete 2025-08-14T21:25:26.6641697Z b3f88809dbfe: Pull complete 2025-08-14T21:25:26.8846877Z 5590b4321046: Pull complete 2025-08-14T21:25:27.0944399Z 740bfb2f3383: Pull complete 2025-08-14T21:25:27.5418983Z d65a0285ce40: Pull complete 2025-08-14T21:25:27.7595425Z 2c6c790a94e2: Pull complete 2025-08-14T21:25:27.9786992Z 9d95dba818e5: Pull complete 2025-08-14T21:25:28.4128532Z 4381ffc02faf: Pull complete 2025-08-14T21:25:29.6204827Z 157c17059e69: Pull complete 2025-08-14T21:26:28.1213952Z 5a5715912775: Pull complete 2025-08-14T21:26:28.1443815Z 3d4b192e6130: Pull complete 2025-08-14T21:26:28.1661848Z 1aef221cbe7d: Pull complete 2025-08-14T21:26:28.2103247Z 6c6566b1f69c: Pull complete 2025-08-14T21:26:28.2537557Z 5629b6e108ff: Pull complete 2025-08-14T21:26:28.2770743Z 3d11a12afe79: Pull complete 2025-08-14T21:26:28.3215730Z 7f43e20df7d5: Pull complete 2025-08-14T21:26:28.3661832Z 1f1ad77e7661: Pull complete 2025-08-14T21:26:28.3886953Z 2b28f013a7e0: Pull complete 2025-08-14T21:26:28.4320168Z f2553a813bfd: Pull complete 2025-08-14T21:26:28.4552796Z 920afab8d80c: Pull complete 2025-08-14T21:26:28.4986103Z 49bf96108822: Pull complete 2025-08-14T21:26:28.5202184Z 41a88d2df171: Pull complete 2025-08-14T21:26:28.5629351Z ac86414debcc: Pull complete 2025-08-14T21:26:28.5869076Z 41e0d573c7e6: Pull complete 2025-08-14T21:26:35.6093699Z cd44cde965e2: Pull complete 2025-08-14T21:26:35.7482884Z 53fa29ec5ecf: Pull complete 2025-08-14T21:26:35.8596295Z 672e0afb1fbc: Pull complete 2025-08-14T21:26:35.9530399Z 2fb460415c5c: Pull complete 2025-08-14T21:26:36.1661961Z 987f67564697: Pull complete 2025-08-14T21:26:36.3780835Z ac6d9eb82fcf: Pull complete 2025-08-14T21:26:38.0609093Z 92b42b4592d6: Pull complete 2025-08-14T21:26:38.2090980Z d87965a1f597: Pull complete 2025-08-14T21:26:39.9385690Z bf708065b046: Pull complete 2025-08-14T21:26:40.1731410Z Digest: sha256:2912f52318e06e9a551538b828fea15def013f278443bb1f452fdd9a824ccbdf 2025-08-14T21:26:40.1972584Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:26:40.2051100Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:26:40.2120580Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-08-14T21:26:40.2121446Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT" 2025-08-14T21:26:40.2134865Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:26:40.2135223Z env: 2025-08-14T21:26:40.2135438Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:26:40.2135684Z ##[endgroup] 2025-08-14T21:26:40.2309135Z ##[group]Run pytorch/test-infra/.github/actions/setup-nvidia@main 2025-08-14T21:26:40.2309546Z with: 2025-08-14T21:26:40.2309760Z driver-version: 570.133.07 2025-08-14T21:26:40.2310017Z env: 2025-08-14T21:26:40.2310225Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:26:40.2310473Z ##[endgroup] 2025-08-14T21:26:40.2379539Z ##[group]Run nick-fields/retry@3e91a01664abd3c5cd539100d10d33b9c5b68482 2025-08-14T21:26:40.2379981Z with: 2025-08-14T21:26:40.2380185Z timeout_minutes: 10 2025-08-14T21:26:40.2380429Z max_attempts: 3 2025-08-14T21:26:40.2404934Z command: # Is it disgusting to have a full shell script here in this github action? Sure # But is it the best way to make it so that this action relies on nothing else? Absolutely set -eou pipefail DISTRIBUTION=$(. /etc/os-release;echo $ID$VERSION_ID) DRIVER_FN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}.run" install_nvidia_docker2_amzn2() { ( set -x # Needed for yum-config-manager sudo yum install -y yum-utils if [[ "${DISTRIBUTION}" == "amzn2023" ]] ; then YUM_REPO_URL="https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo" else # Amazon Linux 2 YUM_REPO_URL="https://nvidia.github.io/nvidia-docker/${DISTRIBUTION}/nvidia-docker.repo" fi sudo yum-config-manager --add-repo "${YUM_REPO_URL}" sudo yum install -y \ nvidia-container-toolkit-1.17.8 \ libnvidia-container-tools-1.17.8 \ libnvidia-container1-1.17.8 \ nvidia-container-toolkit-base-1.17.8 sudo systemctl restart docker ) } install_nvidia_docker2_ubuntu20() { ( set -x # Install nvidia-driver package if not installed status="$(dpkg-query -W --showformat='${db:Status-Status}' nvidia-docker2 2>&1)" if [ ! $? = 0 ] || [ ! "$status" = installed ]; then sudo apt-get install -y nvidia-container-toolkit-1.17.8 sudo systemctl restart docker fi ) } pre_install_nvidia_driver_amzn2() { ( # Purge any nvidia driver installed from RHEL repo sudo yum remove -y nvidia-driver-latest-dkms ) } install_nvidia_driver_common() { ( # Try to gather more information about the runner and its existing NVIDIA driver if any echo "Before installing NVIDIA driver" lspci lsmod modinfo nvidia || true HAS_NVIDIA_DRIVER=0 # Check if NVIDIA driver has already been installed if [ -x "$(command -v nvidia-smi)" ]; then set +e # The driver exists, check its version next. Also check only the first GPU if there are more than one of them # so that the same driver version is not print over multiple lines INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then echo "Failed to get NVIDIA driver version ($INSTALLED_DRIVER_VERSION). Continuing" elif [ "$INSTALLED_DRIVER_VERSION" != "$DRIVER_VERSION" ]; then echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has been installed, but we expect to have $DRIVER_VERSION instead. Continuing" # Turn off persistent mode so that the installation script can unload the kernel module sudo killall nvidia-persistenced || true else HAS_NVIDIA_DRIVER=1 echo "NVIDIA driver ($INSTALLED_DRIVER_VERSION) has already been installed. Skipping NVIDIA driver installation" fi set -e fi if [ "$HAS_NVIDIA_DRIVER" -eq 0 ]; then # CAUTION: this may need to be updated in future if [ "${DISTRIBUTION}" != ubuntu20.04 ]; then sudo yum groupinstall -y "Development Tools" # ensure our kernel install is the same as our underlying kernel, # groupinstall "Development Tools" has a habit of mismatching kernel headers sudo yum install -y "kernel-devel-uname-r == $(uname -r)" sudo modprobe backlight fi sudo curl -fsL -o /tmp/nvidia_driver "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN" set +e sudo /bin/bash /tmp/nvidia_driver -s --no-drm NVIDIA_INSTALLATION_STATUS=$? RESET_GPU=0 if [ "$NVIDIA_INSTALLATION_STATUS" -ne 0 ]; then sudo cat /var/log/nvidia-installer.log # Fail to install NVIDIA driver, try to reset the GPU RESET_GPU=1 elif [ -x "$(command -v nvidia-smi)" ]; then # Check again if nvidia-smi works even if the driver installation completes successfully INSTALLED_DRIVER_VERSION=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0) NVIDIA_SMI_STATUS=$? if [ "$NVIDIA_SMI_STATUS" -ne 0 ] && [ "$NVIDIA_SMI_STATUS" -ne 14 ]; then RESET_GPU=1 fi fi if [ "$RESET_GPU" -eq 1 ]; then NVIDIA_DEVICES=$(lspci -D | grep -i NVIDIA | cut -d' ' -f1) # The GPU can get stuck in a failure state if somehow the test crashs the GPU microcode. When this # happens, we'll try to reset all NVIDIA devices https://github.com/pytorch/pytorch/issues/88388 for PCI_ID in $NVIDIA_DEVICES; do DEVICE_ENABLED=$(cat /sys/bus/pci/devices/$PCI_ID/enable) echo "Reseting $PCI_ID (enabled state: $DEVICE_ENABLED)" # This requires sudo permission of course echo "1" | sudo tee /sys/bus/pci/devices/$PCI_ID/reset sleep 1 done fi sudo rm -fv /tmp/nvidia_driver set -e fi ) } post_install_nvidia_driver_common() { ( sudo modprobe nvidia || true echo "After installing NVIDIA driver" lspci lsmod modinfo nvidia || true ( set +e nvidia-smi # NB: Annoyingly, nvidia-smi command returns successfully with return code 0 even in # the case where the driver has already crashed as it still can get the driver version # and some basic information like the bus ID. However, the rest of the information # would be missing (ERR!), for example: # # +-----------------------------------------------------------------------------+ # | NVIDIA-SMI 525.89.02 Driver Version: 525.89.02 CUDA Version: 12.0 | # |-------------------------------+----------------------+----------------------+ # | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC | # | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | # | | | MIG M. | # |===============================+======================+======================| # | 0 ERR! Off | 00000000:00:1E.0 Off | ERR! | # |ERR! ERR! ERR! ERR! / ERR! | 4184MiB / 23028MiB | ERR! Default | # | | | ERR! | # +-------------------------------+----------------------+----------------------+ # # +-----------------------------------------------------------------------------+ # | Processes: | # | GPU GI CI PID Type Process name GPU Memory | # | ID ID Usage | # |=============================================================================| # +-----------------------------------------------------------------------------+ # # This should be reported as a failure instead as it will guarantee to fail when # Docker tries to run with --gpus all # # So, the correct check here is to query one of the missing piece of info like # GPU name, so that the command can fail accordingly nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 NVIDIA_SMI_STATUS=$? # Allowable exit statuses for nvidia-smi, see: https://github.com/NVIDIA/gpu-operator/issues/285 if [ "$NVIDIA_SMI_STATUS" -eq 0 ] || [ "$NVIDIA_SMI_STATUS" -eq 14 ]; then echo "INFO: Ignoring allowed status ${NVIDIA_SMI_STATUS}" else echo "ERROR: nvidia-smi exited with unresolved status ${NVIDIA_SMI_STATUS}" exit ${NVIDIA_SMI_STATUS} fi set -e ) ) } install_nvidia_driver_amzn2() { ( set -x pre_install_nvidia_driver_amzn2 install_nvidia_driver_common post_install_nvidia_driver_common ) } install_nvidia_driver_ubuntu20() { ( set -x install_nvidia_driver_common post_install_nvidia_driver_common ) } echo "== Installing nvidia driver ${DRIVER_FN} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_driver_amzn2 ;; ubuntu20.04) install_nvidia_driver_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac # Install container toolkit based on distribution echo "== Installing nvidia container toolkit for ${DISTRIBUTION} ==" case "${DISTRIBUTION}" in amzn*) install_nvidia_docker2_amzn2 ;; ubuntu20.04) install_nvidia_docker2_ubuntu20 ;; *) echo "ERROR: Unknown distribution ${DISTRIBUTION}" exit 1 ;; esac echo "GPU_FLAG=--gpus all -e NVIDIA_DRIVER_CAPABILITIES=all" >> "${GITHUB_ENV}" # Fix https://github.com/NVIDIA/nvidia-docker/issues/1648 on runners with # more than one GPUs. This just needs to be run once. The command fails # on subsequent runs and complains that the mode is already on, but that's # ok sudo nvidia-persistenced || true # This should show persistence mode ON nvidia-smi # check if the container-toolkit is correctly installed and CUDA is available inside a container docker run --rm -t --gpus=all public.ecr.aws/docker/library/python:3.13 nvidia-smi 2025-08-14T21:26:40.2429511Z retry_wait_seconds: 10 2025-08-14T21:26:40.2429805Z polling_interval_seconds: 1 2025-08-14T21:26:40.2430097Z warning_on_retry: true 2025-08-14T21:26:40.2430352Z continue_on_error: false 2025-08-14T21:26:40.2430601Z env: 2025-08-14T21:26:40.2430802Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:26:40.2431065Z DRIVER_VERSION: 570.133.07 2025-08-14T21:26:40.2431320Z ##[endgroup] 2025-08-14T21:26:40.3411352Z == Installing nvidia driver NVIDIA-Linux-x86_64-570.133.07.run == 2025-08-14T21:26:40.3412501Z + pre_install_nvidia_driver_amzn2 2025-08-14T21:26:40.3416409Z + sudo yum remove -y nvidia-driver-latest-dkms 2025-08-14T21:26:40.9923261Z No match for argument: nvidia-driver-latest-dkms 2025-08-14T21:26:40.9923785Z No packages marked for removal. 2025-08-14T21:26:40.9988506Z Dependencies resolved. 2025-08-14T21:26:40.9998395Z Nothing to do. 2025-08-14T21:26:40.9998989Z Complete! 2025-08-14T21:26:41.1018313Z + install_nvidia_driver_common 2025-08-14T21:26:41.1023459Z + echo 'Before installing NVIDIA driver' 2025-08-14T21:26:41.1023885Z + lspci 2025-08-14T21:26:41.1025437Z Before installing NVIDIA driver 2025-08-14T21:26:41.2241123Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2025-08-14T21:26:41.2242434Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2025-08-14T21:26:41.2243697Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2025-08-14T21:26:41.2245560Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2025-08-14T21:26:41.2246878Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2025-08-14T21:26:41.2248596Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2025-08-14T21:26:41.2249556Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2025-08-14T21:26:41.2250155Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller 2025-08-14T21:26:41.2250564Z + lsmod 2025-08-14T21:26:41.2298981Z Module Size Used by 2025-08-14T21:26:41.2299433Z xt_conntrack 16384 1 2025-08-14T21:26:41.2299797Z nft_chain_nat 16384 3 2025-08-14T21:26:41.2300164Z xt_MASQUERADE 20480 1 2025-08-14T21:26:41.2300495Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2025-08-14T21:26:41.2300870Z nf_conntrack_netlink 57344 0 2025-08-14T21:26:41.2301426Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2025-08-14T21:26:41.2302033Z nf_defrag_ipv6 24576 1 nf_conntrack 2025-08-14T21:26:41.2302386Z nf_defrag_ipv4 16384 1 nf_conntrack 2025-08-14T21:26:41.2302698Z xfrm_user 57344 1 2025-08-14T21:26:41.2302981Z xfrm_algo 16384 1 xfrm_user 2025-08-14T21:26:41.2303267Z xt_addrtype 16384 2 2025-08-14T21:26:41.2303534Z nft_compat 20480 4 2025-08-14T21:26:41.2303846Z nf_tables 311296 57 nft_compat,nft_chain_nat 2025-08-14T21:26:41.2304260Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2025-08-14T21:26:41.2304643Z br_netfilter 36864 0 2025-08-14T21:26:41.2304930Z bridge 323584 1 br_netfilter 2025-08-14T21:26:41.2305235Z stp 16384 1 bridge 2025-08-14T21:26:41.2305520Z llc 16384 2 bridge,stp 2025-08-14T21:26:41.2305808Z overlay 167936 0 2025-08-14T21:26:41.2306069Z tls 139264 0 2025-08-14T21:26:41.2306319Z nls_ascii 16384 1 2025-08-14T21:26:41.2306585Z nls_cp437 20480 1 2025-08-14T21:26:41.2307038Z vfat 24576 1 2025-08-14T21:26:41.2307294Z fat 86016 1 vfat 2025-08-14T21:26:41.2307567Z sunrpc 700416 1 2025-08-14T21:26:41.2307829Z ghash_clmulni_intel 16384 0 2025-08-14T21:26:41.2308085Z i8042 45056 0 2025-08-14T21:26:41.2308348Z serio 28672 3 i8042 2025-08-14T21:26:41.2308627Z ena 180224 0 2025-08-14T21:26:41.2308877Z button 24576 0 2025-08-14T21:26:41.2309139Z sch_fq_codel 20480 17 2025-08-14T21:26:41.2309403Z fuse 184320 1 2025-08-14T21:26:41.2309657Z dm_mod 188416 0 2025-08-14T21:26:41.2309907Z loop 36864 0 2025-08-14T21:26:41.2310161Z configfs 57344 1 2025-08-14T21:26:41.2310423Z dmi_sysfs 20480 0 2025-08-14T21:26:41.2310678Z crc32_pclmul 16384 0 2025-08-14T21:26:41.2310944Z crc32c_intel 24576 0 2025-08-14T21:26:41.2311204Z efivarfs 24576 1 2025-08-14T21:26:41.2311474Z + modinfo nvidia 2025-08-14T21:26:41.2320293Z filename: /lib/modules/6.1.141-155.222.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2025-08-14T21:26:41.2320907Z import_ns: DMA_BUF 2025-08-14T21:26:41.2321232Z alias: char-major-195-* 2025-08-14T21:26:41.2321505Z version: 570.133.07 2025-08-14T21:26:41.2321777Z supported: external 2025-08-14T21:26:41.2322129Z license: Dual MIT/GPL 2025-08-14T21:26:41.2322540Z firmware: nvidia/570.133.07/gsp_tu10x.bin 2025-08-14T21:26:41.2323001Z firmware: nvidia/570.133.07/gsp_ga10x.bin 2025-08-14T21:26:41.2323440Z srcversion: 49515739FD8F721A3F2F714 2025-08-14T21:26:41.2323796Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2025-08-14T21:26:41.2324137Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2025-08-14T21:26:41.2324608Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2025-08-14T21:26:41.2325239Z depends: i2c-core,drm 2025-08-14T21:26:41.2325591Z retpoline: Y 2025-08-14T21:26:41.2325887Z name: nvidia 2025-08-14T21:26:41.2326255Z vermagic: 6.1.141-155.222.amzn2023.x86_64 SMP preempt mod_unload modversions 2025-08-14T21:26:41.2326790Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2025-08-14T21:26:41.2327399Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2025-08-14T21:26:41.2327945Z parm: NVreg_ResmanDebugLevel:int 2025-08-14T21:26:41.2328268Z parm: NVreg_RmLogonRC:int 2025-08-14T21:26:41.2328578Z parm: NVreg_ModifyDeviceFiles:int 2025-08-14T21:26:41.2328903Z parm: NVreg_DeviceFileUID:int 2025-08-14T21:26:41.2329219Z parm: NVreg_DeviceFileGID:int 2025-08-14T21:26:41.2329531Z parm: NVreg_DeviceFileMode:int 2025-08-14T21:26:41.2329922Z parm: NVreg_InitializeSystemMemoryAllocations:int 2025-08-14T21:26:41.2330454Z parm: NVreg_UsePageAttributeTable:int 2025-08-14T21:26:41.2330905Z parm: NVreg_EnablePCIeGen3:int 2025-08-14T21:26:41.2331301Z parm: NVreg_EnableMSI:int 2025-08-14T21:26:41.2331622Z parm: NVreg_EnableStreamMemOPs:int 2025-08-14T21:26:41.2331987Z parm: NVreg_RestrictProfilingToAdminUsers:int 2025-08-14T21:26:41.2332393Z parm: NVreg_PreserveVideoMemoryAllocations:int 2025-08-14T21:26:41.2332785Z parm: NVreg_EnableS0ixPowerManagement:int 2025-08-14T21:26:41.2333233Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2025-08-14T21:26:41.2333786Z parm: NVreg_DynamicPowerManagement:int 2025-08-14T21:26:41.2334351Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2025-08-14T21:26:41.2334780Z parm: NVreg_EnableGpuFirmware:int 2025-08-14T21:26:41.2335119Z parm: NVreg_EnableGpuFirmwareLogs:int 2025-08-14T21:26:41.2335501Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2025-08-14T21:26:41.2335890Z parm: NVreg_EnableUserNUMAManagement:int 2025-08-14T21:26:41.2336229Z parm: NVreg_MemoryPoolSize:int 2025-08-14T21:26:41.2336669Z parm: NVreg_KMallocHeapMaxSize:int 2025-08-14T21:26:41.2337008Z parm: NVreg_VMallocHeapMaxSize:int 2025-08-14T21:26:41.2337336Z parm: NVreg_IgnoreMMIOCheck:int 2025-08-14T21:26:41.2337646Z parm: NVreg_NvLinkDisable:int 2025-08-14T21:26:41.2337999Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2025-08-14T21:26:41.2338371Z parm: NVreg_RegisterPCIDriver:int 2025-08-14T21:26:41.2338709Z parm: NVreg_EnableResizableBar:int 2025-08-14T21:26:41.2339053Z parm: NVreg_EnableDbgBreakpoint:int 2025-08-14T21:26:41.2339408Z parm: NVreg_EnableNonblockingOpen:int 2025-08-14T21:26:41.2339749Z parm: NVreg_RegistryDwords:charp 2025-08-14T21:26:41.2340102Z parm: NVreg_RegistryDwordsPerDevice:charp 2025-08-14T21:26:41.2340439Z parm: NVreg_RmMsg:charp 2025-08-14T21:26:41.2340738Z parm: NVreg_GpuBlacklist:charp 2025-08-14T21:26:41.2341107Z parm: NVreg_TemporaryFilePath:charp 2025-08-14T21:26:41.2341443Z parm: NVreg_ExcludedGpus:charp 2025-08-14T21:26:41.2341762Z parm: NVreg_DmaRemapPeerMmio:int 2025-08-14T21:26:41.2342088Z parm: NVreg_RmNvlinkBandwidth:charp 2025-08-14T21:26:41.2342450Z parm: NVreg_RmNvlinkBandwidthLinkCount:int 2025-08-14T21:26:41.2342802Z parm: NVreg_ImexChannelCount:int 2025-08-14T21:26:41.2343125Z parm: NVreg_CreateImexChannel0:int 2025-08-14T21:26:41.2343476Z parm: NVreg_GrdmaPciTopoCheckOverride:int 2025-08-14T21:26:41.2343818Z parm: rm_firmware_active:charp 2025-08-14T21:26:41.2344102Z + HAS_NVIDIA_DRIVER=0 2025-08-14T21:26:41.2344349Z ++ command -v nvidia-smi 2025-08-14T21:26:41.2344614Z + '[' -x /usr/bin/nvidia-smi ']' 2025-08-14T21:26:41.2344867Z + set +e 2025-08-14T21:26:41.2345183Z ++ nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0 2025-08-14T21:26:43.0851872Z + INSTALLED_DRIVER_VERSION=570.133.07 2025-08-14T21:26:43.0852263Z + NVIDIA_SMI_STATUS=0 2025-08-14T21:26:43.0852555Z + '[' 0 -ne 0 ']' 2025-08-14T21:26:43.0852867Z + '[' 570.133.07 '!=' 570.133.07 ']' 2025-08-14T21:26:43.0853218Z + HAS_NVIDIA_DRIVER=1 2025-08-14T21:26:43.0853656Z + echo 'NVIDIA driver (570.133.07) has already been installed. Skipping NVIDIA driver installation' 2025-08-14T21:26:43.0854121Z + set -e 2025-08-14T21:26:43.0854312Z + '[' 1 -eq 0 ']' 2025-08-14T21:26:43.0854701Z NVIDIA driver (570.133.07) has already been installed. Skipping NVIDIA driver installation 2025-08-14T21:26:43.0855211Z + post_install_nvidia_driver_common 2025-08-14T21:26:43.0858964Z + sudo modprobe nvidia 2025-08-14T21:26:43.2254016Z + echo 'After installing NVIDIA driver' 2025-08-14T21:26:43.2254326Z + lspci 2025-08-14T21:26:43.2254541Z After installing NVIDIA driver 2025-08-14T21:26:43.2381135Z 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] 2025-08-14T21:26:43.2381818Z 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] 2025-08-14T21:26:43.2382383Z 00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08) 2025-08-14T21:26:43.2382924Z 00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111 2025-08-14T21:26:43.2383397Z 00:04.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe EBS Controller 2025-08-14T21:26:43.2384091Z 00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA) 2025-08-14T21:26:43.2384742Z 00:1e.0 3D controller: NVIDIA Corporation GA102GL [A10G] (rev a1) 2025-08-14T21:26:43.2385228Z 00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller 2025-08-14T21:26:43.2385624Z + lsmod 2025-08-14T21:26:43.2423383Z Module Size Used by 2025-08-14T21:26:43.2423785Z nvidia_uvm 1884160 0 2025-08-14T21:26:43.2424159Z nvidia 11583488 1 nvidia_uvm 2025-08-14T21:26:43.2424451Z drm 602112 1 nvidia 2025-08-14T21:26:43.2424773Z drm_panel_orientation_quirks 32768 1 drm 2025-08-14T21:26:43.2425090Z backlight 24576 1 drm 2025-08-14T21:26:43.2425879Z i2c_core 110592 2 nvidia,drm 2025-08-14T21:26:43.2426237Z xt_conntrack 16384 1 2025-08-14T21:26:43.2426502Z nft_chain_nat 16384 3 2025-08-14T21:26:43.2426852Z xt_MASQUERADE 20480 1 2025-08-14T21:26:43.2427279Z nf_nat 57344 2 nft_chain_nat,xt_MASQUERADE 2025-08-14T21:26:43.2427724Z nf_conntrack_netlink 57344 0 2025-08-14T21:26:43.2428218Z nf_conntrack 184320 4 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE 2025-08-14T21:26:43.2428653Z nf_defrag_ipv6 24576 1 nf_conntrack 2025-08-14T21:26:43.2428973Z nf_defrag_ipv4 16384 1 nf_conntrack 2025-08-14T21:26:43.2429267Z xfrm_user 57344 1 2025-08-14T21:26:43.2429542Z xfrm_algo 16384 1 xfrm_user 2025-08-14T21:26:43.2429833Z xt_addrtype 16384 2 2025-08-14T21:26:43.2430095Z nft_compat 20480 4 2025-08-14T21:26:43.2430405Z nf_tables 311296 57 nft_compat,nft_chain_nat 2025-08-14T21:26:43.2430840Z nfnetlink 20480 4 nft_compat,nf_conntrack_netlink,nf_tables 2025-08-14T21:26:43.2431216Z br_netfilter 36864 0 2025-08-14T21:26:43.2431497Z bridge 323584 1 br_netfilter 2025-08-14T21:26:43.2431789Z stp 16384 1 bridge 2025-08-14T21:26:43.2432078Z llc 16384 2 bridge,stp 2025-08-14T21:26:43.2432366Z overlay 167936 0 2025-08-14T21:26:43.2432614Z tls 139264 0 2025-08-14T21:26:43.2432868Z nls_ascii 16384 1 2025-08-14T21:26:43.2433122Z nls_cp437 20480 1 2025-08-14T21:26:43.2433374Z vfat 24576 1 2025-08-14T21:26:43.2433632Z fat 86016 1 vfat 2025-08-14T21:26:43.2433903Z sunrpc 700416 1 2025-08-14T21:26:43.2434162Z ghash_clmulni_intel 16384 0 2025-08-14T21:26:43.2434425Z i8042 45056 0 2025-08-14T21:26:43.2434874Z serio 28672 3 i8042 2025-08-14T21:26:43.2435153Z ena 180224 0 2025-08-14T21:26:43.2435411Z button 24576 0 2025-08-14T21:26:43.2435671Z sch_fq_codel 20480 17 2025-08-14T21:26:43.2435931Z fuse 184320 1 2025-08-14T21:26:43.2436181Z dm_mod 188416 0 2025-08-14T21:26:43.2436510Z loop 36864 0 2025-08-14T21:26:43.2436824Z configfs 57344 1 2025-08-14T21:26:43.2437091Z dmi_sysfs 20480 0 2025-08-14T21:26:43.2437443Z crc32_pclmul 16384 0 2025-08-14T21:26:43.2437800Z crc32c_intel 24576 0 2025-08-14T21:26:43.2438082Z efivarfs 24576 1 2025-08-14T21:26:43.2438350Z + modinfo nvidia 2025-08-14T21:26:43.2446639Z filename: /lib/modules/6.1.141-155.222.amzn2023.x86_64/kernel/drivers/video/nvidia.ko 2025-08-14T21:26:43.2447549Z import_ns: DMA_BUF 2025-08-14T21:26:43.2447877Z alias: char-major-195-* 2025-08-14T21:26:43.2448163Z version: 570.133.07 2025-08-14T21:26:43.2448417Z supported: external 2025-08-14T21:26:43.2448667Z license: Dual MIT/GPL 2025-08-14T21:26:43.2448958Z firmware: nvidia/570.133.07/gsp_tu10x.bin 2025-08-14T21:26:43.2449305Z firmware: nvidia/570.133.07/gsp_ga10x.bin 2025-08-14T21:26:43.2449620Z srcversion: 49515739FD8F721A3F2F714 2025-08-14T21:26:43.2449944Z alias: pci:v000010DEd*sv*sd*bc06sc80i00* 2025-08-14T21:26:43.2450283Z alias: pci:v000010DEd*sv*sd*bc03sc02i00* 2025-08-14T21:26:43.2450615Z alias: pci:v000010DEd*sv*sd*bc03sc00i00* 2025-08-14T21:26:43.2450921Z depends: i2c-core,drm 2025-08-14T21:26:43.2451182Z retpoline: Y 2025-08-14T21:26:43.2451404Z name: nvidia 2025-08-14T21:26:43.2451760Z vermagic: 6.1.141-155.222.amzn2023.x86_64 SMP preempt mod_unload modversions 2025-08-14T21:26:43.2452228Z parm: NvSwitchRegDwords:NvSwitch regkey (charp) 2025-08-14T21:26:43.2452675Z parm: NvSwitchBlacklist:NvSwitchBlacklist=uuid[,uuid...] (charp) 2025-08-14T21:26:43.2453088Z parm: NVreg_ResmanDebugLevel:int 2025-08-14T21:26:43.2453601Z parm: NVreg_RmLogonRC:int 2025-08-14T21:26:43.2453907Z parm: NVreg_ModifyDeviceFiles:int 2025-08-14T21:26:43.2454225Z parm: NVreg_DeviceFileUID:int 2025-08-14T21:26:43.2454522Z parm: NVreg_DeviceFileGID:int 2025-08-14T21:26:43.2454829Z parm: NVreg_DeviceFileMode:int 2025-08-14T21:26:43.2455197Z parm: NVreg_InitializeSystemMemoryAllocations:int 2025-08-14T21:26:43.2455578Z parm: NVreg_UsePageAttributeTable:int 2025-08-14T21:26:43.2455912Z parm: NVreg_EnablePCIeGen3:int 2025-08-14T21:26:43.2456212Z parm: NVreg_EnableMSI:int 2025-08-14T21:26:43.2456512Z parm: NVreg_EnableStreamMemOPs:int 2025-08-14T21:26:43.2456875Z parm: NVreg_RestrictProfilingToAdminUsers:int 2025-08-14T21:26:43.2457270Z parm: NVreg_PreserveVideoMemoryAllocations:int 2025-08-14T21:26:43.2457648Z parm: NVreg_EnableS0ixPowerManagement:int 2025-08-14T21:26:43.2458067Z parm: NVreg_S0ixPowerManagementVideoMemoryThreshold:int 2025-08-14T21:26:43.2458473Z parm: NVreg_DynamicPowerManagement:int 2025-08-14T21:26:43.2458889Z parm: NVreg_DynamicPowerManagementVideoMemoryThreshold:int 2025-08-14T21:26:43.2459291Z parm: NVreg_EnableGpuFirmware:int 2025-08-14T21:26:43.2459629Z parm: NVreg_EnableGpuFirmwareLogs:int 2025-08-14T21:26:43.2459998Z parm: NVreg_OpenRmEnableUnsupportedGpus:int 2025-08-14T21:26:43.2460365Z parm: NVreg_EnableUserNUMAManagement:int 2025-08-14T21:26:43.2460709Z parm: NVreg_MemoryPoolSize:int 2025-08-14T21:26:43.2461035Z parm: NVreg_KMallocHeapMaxSize:int 2025-08-14T21:26:43.2461359Z parm: NVreg_VMallocHeapMaxSize:int 2025-08-14T21:26:43.2461685Z parm: NVreg_IgnoreMMIOCheck:int 2025-08-14T21:26:43.2462002Z parm: NVreg_NvLinkDisable:int 2025-08-14T21:26:43.2463545Z parm: NVreg_EnablePCIERelaxedOrderingMode:int 2025-08-14T21:26:43.2463926Z parm: NVreg_RegisterPCIDriver:int 2025-08-14T21:26:43.2464265Z parm: NVreg_EnableResizableBar:int 2025-08-14T21:26:43.2464606Z parm: NVreg_EnableDbgBreakpoint:int 2025-08-14T21:26:43.2464954Z parm: NVreg_EnableNonblockingOpen:int 2025-08-14T21:26:43.2465296Z parm: NVreg_RegistryDwords:charp 2025-08-14T21:26:43.2465642Z parm: NVreg_RegistryDwordsPerDevice:charp 2025-08-14T21:26:43.2465971Z parm: NVreg_RmMsg:charp 2025-08-14T21:26:43.2466269Z parm: NVreg_GpuBlacklist:charp 2025-08-14T21:26:43.2466599Z parm: NVreg_TemporaryFilePath:charp 2025-08-14T21:26:43.2466932Z parm: NVreg_ExcludedGpus:charp 2025-08-14T21:26:43.2467247Z parm: NVreg_DmaRemapPeerMmio:int 2025-08-14T21:26:43.2467579Z parm: NVreg_RmNvlinkBandwidth:charp 2025-08-14T21:26:43.2467937Z parm: NVreg_RmNvlinkBandwidthLinkCount:int 2025-08-14T21:26:43.2468291Z parm: NVreg_ImexChannelCount:int 2025-08-14T21:26:43.2468628Z parm: NVreg_CreateImexChannel0:int 2025-08-14T21:26:43.2468980Z parm: NVreg_GrdmaPciTopoCheckOverride:int 2025-08-14T21:26:43.2469317Z parm: rm_firmware_active:charp 2025-08-14T21:26:43.2469603Z + set +e 2025-08-14T21:26:43.2469802Z + nvidia-smi 2025-08-14T21:26:44.6412382Z Thu Aug 14 21:26:44 2025 2025-08-14T21:26:44.6412775Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:26:44.6413272Z | NVIDIA-SMI 570.133.07 Driver Version: 570.133.07 CUDA Version: 12.8 | 2025-08-14T21:26:44.6413752Z |-----------------------------------------+------------------------+----------------------+ 2025-08-14T21:26:44.6414238Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-08-14T21:26:44.6414790Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-08-14T21:26:44.6415218Z | | | MIG M. | 2025-08-14T21:26:44.6415893Z |=========================================+========================+======================| 2025-08-14T21:26:44.6477484Z | 0 NVIDIA A10G Off | 00000000:00:1E.0 Off | 0 | 2025-08-14T21:26:44.6477937Z | 0% 35C P0 66W / 300W | 0MiB / 23028MiB | 4% Default | 2025-08-14T21:26:44.6478330Z | | | N/A | 2025-08-14T21:26:44.6478728Z +-----------------------------------------+------------------------+----------------------+ 2025-08-14T21:26:44.6479136Z 2025-08-14T21:26:44.6479536Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:26:44.6479967Z | Processes: | 2025-08-14T21:26:44.6480417Z | GPU GI CI PID Type Process name GPU Memory | 2025-08-14T21:26:44.6480828Z | ID ID Usage | 2025-08-14T21:26:44.6481172Z |=========================================================================================| 2025-08-14T21:26:44.6481852Z | No running processes found | 2025-08-14T21:26:45.0587248Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:26:45.0587776Z + nvidia-smi --query-gpu=gpu_name --format=csv,noheader --id=0 2025-08-14T21:26:46.4527273Z NVIDIA A10G 2025-08-14T21:26:46.7188825Z + NVIDIA_SMI_STATUS=0 2025-08-14T21:26:46.7189120Z + '[' 0 -eq 0 ']' 2025-08-14T21:26:46.7189383Z + echo 'INFO: Ignoring allowed status 0' 2025-08-14T21:26:46.7189700Z + set -e 2025-08-14T21:26:46.7190286Z INFO: Ignoring allowed status 0 2025-08-14T21:26:46.7198335Z == Installing nvidia container toolkit for amzn2023 == 2025-08-14T21:26:46.7214596Z + sudo yum install -y yum-utils 2025-08-14T21:26:47.1502236Z Last metadata expiration check: 0:11:05 ago on Thu Aug 14 21:15:42 2025. 2025-08-14T21:26:47.1743292Z Package dnf-utils-4.3.0-13.amzn2023.0.5.noarch is already installed. 2025-08-14T21:26:47.2215826Z Dependencies resolved. 2025-08-14T21:26:47.2440623Z Nothing to do. 2025-08-14T21:26:47.2440864Z Complete! 2025-08-14T21:26:47.5373041Z + [[ amzn2023 == \a\m\z\n\2\0\2\3 ]] 2025-08-14T21:26:47.5373750Z + YUM_REPO_URL=https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-08-14T21:26:47.5374606Z + sudo yum-config-manager --add-repo https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-08-14T21:26:47.8991220Z Adding repo from: https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo 2025-08-14T21:26:47.9474547Z + sudo yum install -y nvidia-container-toolkit-1.17.8 libnvidia-container-tools-1.17.8 libnvidia-container1-1.17.8 nvidia-container-toolkit-base-1.17.8 2025-08-14T21:26:48.4839934Z nvidia-container-toolkit 20 kB/s | 833 B 00:00 2025-08-14T21:26:48.5081875Z Package nvidia-container-toolkit-1.17.8-1.x86_64 is already installed. 2025-08-14T21:26:48.5087642Z Package libnvidia-container-tools-1.17.8-1.x86_64 is already installed. 2025-08-14T21:26:48.5092222Z Package libnvidia-container1-1.17.8-1.x86_64 is already installed. 2025-08-14T21:26:48.5098269Z Package nvidia-container-toolkit-base-1.17.8-1.x86_64 is already installed. 2025-08-14T21:26:48.5572199Z Dependencies resolved. 2025-08-14T21:26:48.5796384Z Nothing to do. 2025-08-14T21:26:48.5796611Z Complete! 2025-08-14T21:26:48.6915084Z + sudo systemctl restart docker 2025-08-14T21:27:30.2060367Z Thu Aug 14 21:27:30 2025 2025-08-14T21:27:30.2062545Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:27:30.2064494Z | NVIDIA-SMI 570.133.07 Driver Version: 570.133.07 CUDA Version: 12.8 | 2025-08-14T21:27:30.2065540Z |-----------------------------------------+------------------------+----------------------+ 2025-08-14T21:27:30.2066224Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-08-14T21:27:30.2066881Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-08-14T21:27:30.2067423Z | | | MIG M. | 2025-08-14T21:27:30.2067850Z |=========================================+========================+======================| 2025-08-14T21:27:30.2144228Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2025-08-14T21:27:30.2144791Z | 0% 35C P0 66W / 300W | 0MiB / 23028MiB | 4% Default | 2025-08-14T21:27:30.2145299Z | | | N/A | 2025-08-14T21:27:30.2145818Z +-----------------------------------------+------------------------+----------------------+ 2025-08-14T21:27:30.2146313Z 2025-08-14T21:27:30.2146807Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:27:30.2147736Z | Processes: | 2025-08-14T21:27:30.2148294Z | GPU GI CI PID Type Process name GPU Memory | 2025-08-14T21:27:30.2148809Z | ID ID Usage | 2025-08-14T21:27:30.2149245Z |=========================================================================================| 2025-08-14T21:27:30.2150087Z | No running processes found | 2025-08-14T21:27:30.2150703Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:27:30.3734039Z Unable to find image 'public.ecr.aws/docker/library/python:3.13' locally 2025-08-14T21:27:30.6488030Z 3.13: Pulling from docker/library/python 2025-08-14T21:27:30.7520983Z 80b7316254b3: Pulling fs layer 2025-08-14T21:27:30.7521328Z 36e4db86de6e: Pulling fs layer 2025-08-14T21:27:30.7521680Z 8ea45766c644: Pulling fs layer 2025-08-14T21:27:30.7521945Z 3cb1455cf185: Pulling fs layer 2025-08-14T21:27:30.7522226Z 775c3c87ce6e: Pulling fs layer 2025-08-14T21:27:30.7522481Z 363804934c90: Pulling fs layer 2025-08-14T21:27:30.7522743Z f597144abcb3: Pulling fs layer 2025-08-14T21:27:30.7522995Z 3cb1455cf185: Waiting 2025-08-14T21:27:30.7523215Z 363804934c90: Waiting 2025-08-14T21:27:30.7523438Z 775c3c87ce6e: Waiting 2025-08-14T21:27:30.7523664Z f597144abcb3: Waiting 2025-08-14T21:27:31.0401549Z 80b7316254b3: Verifying Checksum 2025-08-14T21:27:31.0402015Z 80b7316254b3: Download complete 2025-08-14T21:27:31.0736697Z 8ea45766c644: Verifying Checksum 2025-08-14T21:27:31.0737306Z 8ea45766c644: Download complete 2025-08-14T21:27:31.1560202Z 775c3c87ce6e: Verifying Checksum 2025-08-14T21:27:31.1560540Z 775c3c87ce6e: Download complete 2025-08-14T21:27:31.2987857Z 363804934c90: Verifying Checksum 2025-08-14T21:27:31.2988270Z 363804934c90: Download complete 2025-08-14T21:27:31.3144388Z 36e4db86de6e: Verifying Checksum 2025-08-14T21:27:31.3144735Z 36e4db86de6e: Download complete 2025-08-14T21:27:31.3443611Z f597144abcb3: Verifying Checksum 2025-08-14T21:27:31.3443958Z f597144abcb3: Download complete 2025-08-14T21:27:31.8398228Z 3cb1455cf185: Verifying Checksum 2025-08-14T21:27:31.8398678Z 3cb1455cf185: Download complete 2025-08-14T21:27:32.9836675Z 80b7316254b3: Pull complete 2025-08-14T21:27:33.6825365Z 36e4db86de6e: Pull complete 2025-08-14T21:27:36.1701336Z 8ea45766c644: Pull complete 2025-08-14T21:27:42.5812198Z 3cb1455cf185: Pull complete 2025-08-14T21:27:42.8643600Z 775c3c87ce6e: Pull complete 2025-08-14T21:27:43.6066382Z 363804934c90: Pull complete 2025-08-14T21:27:43.6315153Z f597144abcb3: Pull complete 2025-08-14T21:27:43.6451298Z Digest: sha256:50cbf8e58ca53a806b99250b1ba2d16c19433e8c42e7eb4ac4ea924b095e280b 2025-08-14T21:27:43.6492008Z Status: Downloaded newer image for public.ecr.aws/docker/library/python:3.13 2025-08-14T21:27:50.9315001Z Thu Aug 14 21:27:50 2025 2025-08-14T21:27:50.9317112Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:27:50.9317694Z | NVIDIA-SMI 570.133.07 Driver Version: 570.133.07 CUDA Version: 12.8 | 2025-08-14T21:27:50.9318303Z |-----------------------------------------+------------------------+----------------------+ 2025-08-14T21:27:50.9318808Z | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | 2025-08-14T21:27:50.9319593Z | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | 2025-08-14T21:27:50.9320146Z | | | MIG M. | 2025-08-14T21:27:50.9320485Z |=========================================+========================+======================| 2025-08-14T21:27:50.9457849Z | 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 | 2025-08-14T21:27:50.9458364Z | 0% 33C P8 9W / 300W | 0MiB / 23028MiB | 0% Default | 2025-08-14T21:27:50.9458751Z | | | N/A | 2025-08-14T21:27:50.9459147Z +-----------------------------------------+------------------------+----------------------+ 2025-08-14T21:27:50.9461996Z 2025-08-14T21:27:50.9462479Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:27:50.9463149Z | Processes: | 2025-08-14T21:27:50.9463595Z | GPU GI CI PID Type Process name GPU Memory | 2025-08-14T21:27:50.9464004Z | ID ID Usage | 2025-08-14T21:27:50.9464352Z |=========================================================================================| 2025-08-14T21:27:50.9468275Z | No running processes found | 2025-08-14T21:27:50.9468746Z +-----------------------------------------------------------------------------------------+ 2025-08-14T21:27:52.3585369Z Command completed after 1 attempt(s). 2025-08-14T21:27:52.3684589Z Prepare all required actions 2025-08-14T21:27:52.3712708Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-08-14T21:27:52.3713033Z with: 2025-08-14T21:27:52.3713634Z github-token: *** 2025-08-14T21:27:52.3713855Z env: 2025-08-14T21:27:52.3714062Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:27:52.3714393Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:27:52.3714728Z ##[endgroup] 2025-08-14T21:27:52.3733569Z ##[group]Run set -eux 2025-08-14T21:27:52.3733827Z set -eux 2025-08-14T21:27:52.3734246Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-08-14T21:27:52.3749627Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:27:52.3749983Z env: 2025-08-14T21:27:52.3750196Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:27:52.3750512Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:27:52.3751000Z GITHUB_TOKEN: *** 2025-08-14T21:27:52.3751236Z ##[endgroup] 2025-08-14T21:27:52.3794268Z + python3 .github/scripts/get_workflow_job_id.py 16976255024 i-006ff0d0e48756a9a 2025-08-14T21:27:53.1499423Z Setting output job-id=48128162047 2025-08-14T21:27:53.1500015Z Setting output job-name=linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:27:53.1631868Z ##[group]Run python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-08-14T21:27:53.1632549Z python3 -m pip install psutil==5.9.8 dataclasses_json==0.6.7 nvidia-ml-py==11.525.84 2025-08-14T21:27:53.1633433Z python3 -m tools.stats.monitor --log-interval "$MONITOR_LOG_INTERVAL" --data-collect-interval "$MONITOR_DATA_COLLECT_INTERVAL" > usage_log.txt 2>&1 & 2025-08-14T21:27:53.1634231Z echo "monitor-script-pid=${!}" >> "${GITHUB_OUTPUT}" 2025-08-14T21:27:53.1643222Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:27:53.1643575Z env: 2025-08-14T21:27:53.1643788Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:27:53.1644109Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:27:53.1644584Z JOB_ID: 48128162047 2025-08-14T21:27:53.1645035Z JOB_NAME: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:27:53.1645547Z WORKFLOW_NAME: slow 2025-08-14T21:27:53.1645807Z WORKFLOW_RUN_ID: 16976255024 2025-08-14T21:27:53.1646081Z MONITOR_LOG_INTERVAL: 5 2025-08-14T21:27:53.1646340Z MONITOR_DATA_COLLECT_INTERVAL: 1 2025-08-14T21:27:53.1646619Z ##[endgroup] 2025-08-14T21:27:53.4540798Z Defaulting to user installation because normal site-packages is not writeable 2025-08-14T21:27:53.8338064Z Collecting psutil==5.9.8 2025-08-14T21:27:53.8493771Z Downloading psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (288 kB) 2025-08-14T21:27:53.9238750Z Collecting dataclasses_json==0.6.7 2025-08-14T21:27:53.9273161Z Downloading dataclasses_json-0.6.7-py3-none-any.whl (28 kB) 2025-08-14T21:27:53.9535509Z Collecting nvidia-ml-py==11.525.84 2025-08-14T21:27:53.9566224Z Downloading nvidia_ml_py-11.525.84-py3-none-any.whl (34 kB) 2025-08-14T21:27:53.9878151Z Collecting typing-inspect<1,>=0.4.0 2025-08-14T21:27:53.9907198Z Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB) 2025-08-14T21:27:54.1089689Z Collecting marshmallow<4.0.0,>=3.18.0 2025-08-14T21:27:54.1121481Z Downloading marshmallow-3.26.1-py3-none-any.whl (50 kB) 2025-08-14T21:27:54.1699307Z Collecting packaging>=17.0 2025-08-14T21:27:54.1731378Z Downloading packaging-25.0-py3-none-any.whl (66 kB) 2025-08-14T21:27:54.2249350Z Collecting typing-extensions>=3.7.4 2025-08-14T21:27:54.2280597Z Downloading typing_extensions-4.14.1-py3-none-any.whl (43 kB) 2025-08-14T21:27:54.2476417Z Collecting mypy-extensions>=0.3.0 2025-08-14T21:27:54.2505693Z Downloading mypy_extensions-1.1.0-py3-none-any.whl (5.0 kB) 2025-08-14T21:27:54.3413377Z Installing collected packages: typing-extensions, packaging, mypy-extensions, typing-inspect, marshmallow, psutil, nvidia-ml-py, dataclasses-json 2025-08-14T21:27:54.6140409Z Successfully installed dataclasses-json-0.6.7 marshmallow-3.26.1 mypy-extensions-1.1.0 nvidia-ml-py-11.525.84 packaging-25.0 psutil-5.9.8 typing-extensions-4.14.1 typing-inspect-0.9.0 2025-08-14T21:27:54.8032020Z Prepare all required actions 2025-08-14T21:27:54.8032390Z Getting action download info 2025-08-14T21:27:54.9843770Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-08-14T21:27:55.2601483Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-08-14T21:27:55.6782649Z ##[group]Run ./.github/actions/download-build-artifacts 2025-08-14T21:27:55.6782993Z with: 2025-08-14T21:27:55.6783247Z name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-08-14T21:27:55.6783572Z s3-bucket: gha-artifacts 2025-08-14T21:27:55.6783821Z env: 2025-08-14T21:27:55.6784027Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:27:55.6784364Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:27:55.6784728Z ##[endgroup] 2025-08-14T21:27:55.6863860Z ##[group]Run seemethere/download-artifact-s3@v4 2025-08-14T21:27:55.6864180Z with: 2025-08-14T21:27:55.6864448Z name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-08-14T21:27:55.6864768Z s3-bucket: gha-artifacts 2025-08-14T21:27:55.6865225Z region: us-east-1 2025-08-14T21:27:55.6865433Z env: 2025-08-14T21:27:55.6865631Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:27:55.6865949Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:27:55.6866283Z ##[endgroup] 2025-08-14T21:27:56.1871590Z (node:62560) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-08-14T21:27:56.1872472Z 2025-08-14T21:27:56.1872829Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-08-14T21:27:56.1873795Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-08-14T21:27:56.1874662Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-08-14T21:27:56.4837032Z Found 1 objects with prefix pytorch/pytorch/16976255024/linux-jammy-cuda12.8-py3.10-gcc11-sm86/ 2025-08-14T21:27:56.4838474Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-08-14T21:28:11.5056511Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/artifacts.zip 2025-08-14T21:28:11.5062025Z Artifact download has finished successfully 2025-08-14T21:28:11.5425914Z ##[group]Run unzip -o artifacts.zip 2025-08-14T21:28:11.5426244Z unzip -o artifacts.zip 2025-08-14T21:28:11.5436117Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:11.5436468Z env: 2025-08-14T21:28:11.5436682Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:11.5437008Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:11.5437340Z ##[endgroup] 2025-08-14T21:28:11.5516545Z Archive: artifacts.zip 2025-08-14T21:28:11.5518025Z creating: dist/ 2025-08-14T21:28:13.5663081Z inflating: dist/torch-2.9.0a0+git1fc683c-cp310-cp310-linux_x86_64.whl 2025-08-14T21:28:13.5800404Z inflating: dist/.ninja_log 2025-08-14T21:28:13.5800740Z creating: build/custom_test_artifacts/ 2025-08-14T21:28:13.5801135Z creating: build/custom_test_artifacts/custom-op-build/ 2025-08-14T21:28:13.5801612Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-08-14T21:28:13.5802171Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-08-14T21:28:13.5810966Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-08-14T21:28:13.5811589Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/ 2025-08-14T21:28:13.5812199Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-08-14T21:28:13.5812856Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-08-14T21:28:13.5813490Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-08-14T21:28:13.5816102Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-08-14T21:28:13.5817858Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-08-14T21:28:13.5818595Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-08-14T21:28:13.5819271Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-08-14T21:28:13.5819924Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-08-14T21:28:13.5823019Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-08-14T21:28:13.5824334Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-08-14T21:28:13.5825661Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-08-14T21:28:13.5827870Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-08-14T21:28:13.5829973Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-08-14T21:28:13.5830914Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-08-14T21:28:13.5831583Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-08-14T21:28:13.5891798Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-08-14T21:28:13.5953575Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-08-14T21:28:13.5955496Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-08-14T21:28:13.6018778Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-08-14T21:28:13.6019735Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-08-14T21:28:13.6020971Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-08-14T21:28:13.6021978Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-08-14T21:28:13.6022935Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-08-14T21:28:13.6023871Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-08-14T21:28:13.6024963Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-08-14T21:28:13.6026165Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-08-14T21:28:13.6027783Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-08-14T21:28:13.6028650Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-08-14T21:28:13.6029671Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-08-14T21:28:13.6031269Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-08-14T21:28:13.6032483Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-08-14T21:28:13.6033647Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-08-14T21:28:13.6036810Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-08-14T21:28:13.6112375Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-08-14T21:28:13.6113116Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-08-14T21:28:13.6188799Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-08-14T21:28:13.6189522Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-08-14T21:28:13.6190133Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-08-14T21:28:13.6191333Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-08-14T21:28:13.6192551Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-08-14T21:28:13.6193879Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-08-14T21:28:13.6195422Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-08-14T21:28:13.6197161Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-08-14T21:28:13.6198535Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-08-14T21:28:13.6199925Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-08-14T21:28:13.6200792Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-08-14T21:28:13.6201505Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-08-14T21:28:13.6202219Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-08-14T21:28:13.6202927Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-08-14T21:28:13.6222311Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-08-14T21:28:13.6416091Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-08-14T21:28:13.6416762Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-08-14T21:28:13.6417487Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-08-14T21:28:13.6418299Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-08-14T21:28:13.6419070Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-08-14T21:28:13.6419795Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-08-14T21:28:13.6420977Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-08-14T21:28:13.6421824Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-08-14T21:28:13.6422982Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-08-14T21:28:13.6423745Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-08-14T21:28:13.6424742Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-08-14T21:28:13.6446508Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-08-14T21:28:13.6528726Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-08-14T21:28:13.6530191Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-08-14T21:28:13.6530919Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-08-14T21:28:13.6531811Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-08-14T21:28:13.6532429Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-08-14T21:28:13.6533040Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-08-14T21:28:13.6533942Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/InstallScripts.json 2025-08-14T21:28:13.6534624Z inflating: build/custom_test_artifacts/custom-op-build/detect_cuda_version.cc 2025-08-14T21:28:13.6537933Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-08-14T21:28:13.6538825Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-08-14T21:28:13.6539705Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-08-14T21:28:13.6705419Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-08-14T21:28:13.6762296Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-08-14T21:28:13.6762788Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-08-14T21:28:13.6763440Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-08-14T21:28:13.6763975Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-08-14T21:28:13.6772106Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-08-14T21:28:13.6772719Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/ 2025-08-14T21:28:13.6773315Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-08-14T21:28:13.6773958Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-08-14T21:28:13.6774583Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-08-14T21:28:13.6777220Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-08-14T21:28:13.6778610Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-08-14T21:28:13.6779703Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-08-14T21:28:13.6780361Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-08-14T21:28:13.6781000Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-08-14T21:28:13.6784006Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-08-14T21:28:13.6785367Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-08-14T21:28:13.6786684Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-08-14T21:28:13.6788859Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-08-14T21:28:13.6791067Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-08-14T21:28:13.6791805Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-08-14T21:28:13.6792452Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-08-14T21:28:13.6852740Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-08-14T21:28:13.6914967Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-08-14T21:28:13.6916707Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-08-14T21:28:13.6980567Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-08-14T21:28:13.6982634Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-08-14T21:28:13.6984533Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-08-14T21:28:13.6986470Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-08-14T21:28:13.6988349Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-08-14T21:28:13.6990171Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-08-14T21:28:13.6991090Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-08-14T21:28:13.6992013Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-08-14T21:28:13.6992913Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-08-14T21:28:13.6993885Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-08-14T21:28:13.6994698Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-08-14T21:28:13.6995500Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-08-14T21:28:13.6996317Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-08-14T21:28:13.6997105Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-08-14T21:28:13.6997901Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-08-14T21:28:13.7073031Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-08-14T21:28:13.7074333Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-08-14T21:28:13.7149996Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-08-14T21:28:13.7150829Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-08-14T21:28:13.7151383Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-08-14T21:28:13.7151958Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-08-14T21:28:13.7152565Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-08-14T21:28:13.7153259Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-08-14T21:28:13.7154064Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-08-14T21:28:13.7154827Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-08-14T21:28:13.7155526Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-08-14T21:28:13.7156249Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-08-14T21:28:13.7156977Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-08-14T21:28:13.7158322Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-08-14T21:28:13.7159813Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-08-14T21:28:13.7160781Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-08-14T21:28:13.7182643Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-08-14T21:28:13.7246120Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-08-14T21:28:13.7246904Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-08-14T21:28:13.7247983Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-08-14T21:28:13.7248767Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-08-14T21:28:13.7249962Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-08-14T21:28:13.7252026Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-08-14T21:28:13.7252653Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/InstallScripts.json 2025-08-14T21:28:13.7253333Z inflating: build/custom_test_artifacts/jit-hook-build/detect_cuda_version.cc 2025-08-14T21:28:13.7256570Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-08-14T21:28:13.7257313Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-08-14T21:28:13.7258386Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-08-14T21:28:13.7297791Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-08-14T21:28:13.7298319Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-08-14T21:28:13.7298986Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-08-14T21:28:13.7299712Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-08-14T21:28:13.7308212Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-08-14T21:28:13.7308867Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/ 2025-08-14T21:28:13.7309521Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeSystem.cmake 2025-08-14T21:28:13.7310241Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/ 2025-08-14T21:28:13.7310945Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/tmp/ 2025-08-14T21:28:13.7313337Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/CMakeCCompilerId.c 2025-08-14T21:28:13.7314806Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdC/a.out 2025-08-14T21:28:13.7315899Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCCompiler.cmake 2025-08-14T21:28:13.7316614Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/ 2025-08-14T21:28:13.7317301Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/tmp/ 2025-08-14T21:28:13.7320125Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-08-14T21:28:13.7321589Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCXX/a.out 2025-08-14T21:28:13.7322898Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCXXCompiler.cmake 2025-08-14T21:28:13.7325072Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_C.bin 2025-08-14T21:28:13.7327251Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CXX.bin 2025-08-14T21:28:13.7328035Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/ 2025-08-14T21:28:13.7328729Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/ 2025-08-14T21:28:13.7388988Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp4.ii 2025-08-14T21:28:13.7450009Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.cpp 2025-08-14T21:28:13.7451029Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.module_id 2025-08-14T21:28:13.7516128Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cpp1.ii 2025-08-14T21:28:13.7517311Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.c 2025-08-14T21:28:13.7518528Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.gpu 2025-08-14T21:28:13.7519766Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.cudafe1.stub.c 2025-08-14T21:28:13.7520968Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx 2025-08-14T21:28:13.7522081Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.sm_52.cubin 2025-08-14T21:28:13.7523061Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin 2025-08-14T21:28:13.7524029Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.fatbin.c 2025-08-14T21:28:13.7525272Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/CMakeCUDACompilerId.o 2025-08-14T21:28:13.7526184Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.sm_52.cubin 2025-08-14T21:28:13.7527173Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.reg.c 2025-08-14T21:28:13.7528301Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin 2025-08-14T21:28:13.7529480Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.fatbin.c 2025-08-14T21:28:13.7530685Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/tmp/a_dlink.o 2025-08-14T21:28:13.7533718Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/CMakeCUDACompilerId.cu 2025-08-14T21:28:13.7609511Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CompilerIdCUDA/a.out 2025-08-14T21:28:13.7611026Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeCUDACompiler.cmake 2025-08-14T21:28:13.7686551Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/4.0.0/CMakeDetermineCompilerABI_CUDA.bin 2025-08-14T21:28:13.7687315Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-08-14T21:28:13.7687923Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-08-14T21:28:13.7688559Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-08-14T21:28:13.7689230Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-08-14T21:28:13.7689984Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-08-14T21:28:13.7690882Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-08-14T21:28:13.7691701Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-08-14T21:28:13.7692474Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-08-14T21:28:13.7693417Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-08-14T21:28:13.7694225Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-08-14T21:28:13.7695014Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-08-14T21:28:13.7696117Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-08-14T21:28:13.7697084Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-08-14T21:28:13.7701682Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-08-14T21:28:13.7823128Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-08-14T21:28:13.7824692Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-08-14T21:28:13.7826604Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-08-14T21:28:13.7828353Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-08-14T21:28:13.7830063Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-08-14T21:28:13.7841841Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-08-14T21:28:13.7842708Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-08-14T21:28:13.7843547Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-08-14T21:28:13.7844388Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-08-14T21:28:13.7845334Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-08-14T21:28:13.7846158Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-08-14T21:28:13.7853511Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-08-14T21:28:13.7908256Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-08-14T21:28:13.7909154Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-08-14T21:28:13.7909986Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-08-14T21:28:13.7911332Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-08-14T21:28:13.7912394Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-08-14T21:28:13.7914162Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-08-14T21:28:13.7915519Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/InstallScripts.json 2025-08-14T21:28:13.7916836Z inflating: build/custom_test_artifacts/custom-backend-build/detect_cuda_version.cc 2025-08-14T21:28:13.7918630Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-08-14T21:28:13.7919745Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-08-14T21:28:13.7920484Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-08-14T21:28:13.8023409Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-08-14T21:28:13.8063194Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-08-14T21:28:13.8063839Z creating: build/lib/ 2025-08-14T21:28:13.8146639Z inflating: build/lib/libprotobuf-lite.a 2025-08-14T21:28:13.8592020Z inflating: build/lib/libprotobuf.a 2025-08-14T21:28:13.9088154Z inflating: build/lib/libprotoc.a 2025-08-14T21:28:13.9098033Z inflating: build/lib/libpthreadpool.a 2025-08-14T21:28:13.9106524Z inflating: build/lib/libcpuinfo.a 2025-08-14T21:28:13.9114550Z inflating: build/lib/libcpuinfo_internals.a 2025-08-14T21:28:13.9115506Z inflating: build/lib/libclog.a 2025-08-14T21:28:13.9134936Z inflating: build/lib/libpytorch_qnnpack.a 2025-08-14T21:28:13.9137696Z inflating: build/lib/libnnpack_reference_layers.a 2025-08-14T21:28:13.9156080Z inflating: build/lib/libnnpack.a 2025-08-14T21:28:13.9344453Z inflating: build/lib/libmicrokernels-prod.a 2025-08-14T21:28:14.0226834Z inflating: build/lib/libmicrokernels-all.a 2025-08-14T21:28:14.0297277Z inflating: build/lib/libgtest.a 2025-08-14T21:28:14.0314636Z inflating: build/lib/libgmock.a 2025-08-14T21:28:14.0315526Z inflating: build/lib/libgtest_main.a 2025-08-14T21:28:14.0316644Z inflating: build/lib/libgmock_main.a 2025-08-14T21:28:14.0408301Z inflating: build/lib/libXNNPACK.a 2025-08-14T21:28:14.0483639Z inflating: build/lib/libbenchmark.a 2025-08-14T21:28:14.0484231Z inflating: build/lib/libbenchmark_main.a 2025-08-14T21:28:14.0485896Z inflating: build/lib/libjitprofiling.a 2025-08-14T21:28:14.0552135Z inflating: build/lib/libasmjit.a 2025-08-14T21:28:14.0560433Z inflating: build/lib/libittnotify.a 2025-08-14T21:28:14.1750831Z inflating: build/lib/libfbgemm.a 2025-08-14T21:28:14.1782170Z inflating: build/lib/libtensorpipe_uv.a 2025-08-14T21:28:14.2337283Z inflating: build/lib/libtensorpipe.a 2025-08-14T21:28:14.2586035Z inflating: build/lib/libtensorpipe_cuda.a 2025-08-14T21:28:14.2721083Z inflating: build/lib/libgloo.a 2025-08-14T21:28:14.2768386Z inflating: build/lib/libonnx_proto.a 2025-08-14T21:28:14.3214309Z inflating: build/lib/libgloo_cuda.a 2025-08-14T21:28:14.3934300Z inflating: build/lib/libonnx.a 2025-08-14T21:28:15.4226453Z inflating: build/lib/libdnnl.a 2025-08-14T21:28:15.4245790Z inflating: build/lib/libfmt.a 2025-08-14T21:28:15.4696010Z inflating: build/lib/libkineto.a 2025-08-14T21:28:15.4806812Z inflating: build/lib/libc10.so 2025-08-14T21:28:15.4850832Z inflating: build/lib/libc10_cuda.so 2025-08-14T21:28:15.4852756Z inflating: build/lib/libcaffe2_nvrtc.so 2025-08-14T21:28:15.4854370Z inflating: build/lib/libtorch_global_deps.so 2025-08-14T21:28:18.5251841Z inflating: build/lib/libtorch_cpu.so 2025-08-14T21:28:18.6009324Z inflating: build/lib/libtorch_nvshmem.so 2025-08-14T21:28:20.5571898Z inflating: build/lib/libtorch_cuda.so 2025-08-14T21:28:20.5573090Z inflating: build/lib/libtorch.so 2025-08-14T21:28:20.5625407Z inflating: build/lib/libtorch_cuda_linalg.so 2025-08-14T21:28:20.5643714Z inflating: build/lib/libjitbackend_test.so 2025-08-14T21:28:20.5716116Z inflating: build/lib/libtorchbind_test.so 2025-08-14T21:28:20.5740234Z inflating: build/lib/libbackend_with_compiler.so 2025-08-14T21:28:20.5767729Z inflating: build/lib/libaoti_custom_ops.so 2025-08-14T21:28:20.5770767Z inflating: build/lib/libc10d_cuda_test.so 2025-08-14T21:28:20.5774957Z inflating: build/lib/libshm.so 2025-08-14T21:28:20.7859913Z inflating: build/lib/libtorch_python.so 2025-08-14T21:28:20.7894013Z inflating: build/lib/libnnapi_backend.so 2025-08-14T21:28:20.7894916Z creating: build/bin/ 2025-08-14T21:28:20.8354199Z inflating: build/bin/protoc-3.13.0.0 2025-08-14T21:28:20.8812418Z inflating: build/bin/protoc 2025-08-14T21:28:20.8872051Z inflating: build/bin/c10_AllocatorConfig_test 2025-08-14T21:28:20.8928214Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-08-14T21:28:20.8986371Z inflating: build/bin/c10_DeviceGuard_test 2025-08-14T21:28:20.9044409Z inflating: build/bin/c10_Device_test 2025-08-14T21:28:20.9111962Z inflating: build/bin/c10_DispatchKeySet_test 2025-08-14T21:28:20.9166174Z inflating: build/bin/c10_StreamGuard_test 2025-08-14T21:28:20.9226047Z inflating: build/bin/c10_Scalar_test 2025-08-14T21:28:20.9283546Z inflating: build/bin/c10_SymInt_test 2025-08-14T21:28:20.9344309Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-08-14T21:28:20.9406181Z inflating: build/bin/c10_InlineStreamGuard_test 2025-08-14T21:28:20.9468581Z inflating: build/bin/c10_SizesAndStrides_test 2025-08-14T21:28:20.9523806Z inflating: build/bin/c10_ArrayRef_test 2025-08-14T21:28:20.9600979Z inflating: build/bin/c10_cow_test 2025-08-14T21:28:20.9656307Z inflating: build/bin/c10_ConstexprCrc_test 2025-08-14T21:28:20.9715353Z inflating: build/bin/c10_Bitset_test 2025-08-14T21:28:20.9771188Z inflating: build/bin/c10_DeadlockDetection_test 2025-08-14T21:28:20.9834889Z inflating: build/bin/c10_Enumerate_test 2025-08-14T21:28:20.9891512Z inflating: build/bin/c10_Half_test 2025-08-14T21:28:20.9950540Z inflating: build/bin/c10_IntrusiveList_test 2025-08-14T21:28:21.0012746Z inflating: build/bin/c10_LeftRight_test 2025-08-14T21:28:21.0075302Z inflating: build/bin/c10_Metaprogramming_test 2025-08-14T21:28:21.0134235Z inflating: build/bin/c10_NetworkFlow_test 2025-08-14T21:28:21.0190210Z inflating: build/bin/c10_Semaphore_test 2025-08-14T21:28:21.0247382Z inflating: build/bin/c10_Synchronized_test 2025-08-14T21:28:21.0305808Z inflating: build/bin/c10_TypeIndex_test 2025-08-14T21:28:21.0362410Z inflating: build/bin/c10_TypeList_test 2025-08-14T21:28:21.0424151Z inflating: build/bin/c10_ThreadLocal_test 2025-08-14T21:28:21.0479110Z inflating: build/bin/c10_TypeTraits_test 2025-08-14T21:28:21.0536881Z inflating: build/bin/c10_accumulate_test 2025-08-14T21:28:21.0598958Z inflating: build/bin/c10_bfloat16_test 2025-08-14T21:28:21.0655792Z inflating: build/bin/c10_bit_cast_test 2025-08-14T21:28:21.0711194Z inflating: build/bin/c10_error_test 2025-08-14T21:28:21.0774369Z inflating: build/bin/c10_complex_math_test 2025-08-14T21:28:21.0835703Z inflating: build/bin/c10_complex_test 2025-08-14T21:28:21.0894889Z inflating: build/bin/c10_exception_test 2025-08-14T21:28:21.0951660Z inflating: build/bin/c10_flags_test 2025-08-14T21:28:21.1007535Z inflating: build/bin/c10_generic_math_test 2025-08-14T21:28:21.1067522Z inflating: build/bin/c10_lazy_test 2025-08-14T21:28:21.1124365Z inflating: build/bin/c10_irange_test 2025-08-14T21:28:21.1188737Z inflating: build/bin/c10_logging_test 2025-08-14T21:28:21.1364063Z inflating: build/bin/c10_intrusive_ptr_test 2025-08-14T21:28:21.1446578Z inflating: build/bin/c10_optional_test 2025-08-14T21:28:21.1515301Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-08-14T21:28:21.1575045Z inflating: build/bin/c10_registry_test 2025-08-14T21:28:21.1740399Z inflating: build/bin/c10_small_vector_test 2025-08-14T21:28:21.1798257Z inflating: build/bin/c10_ssize_test 2025-08-14T21:28:21.1861088Z inflating: build/bin/c10_string_util_test 2025-08-14T21:28:21.1916007Z inflating: build/bin/c10_string_view_test 2025-08-14T21:28:21.1979138Z inflating: build/bin/c10_typeid_test 2025-08-14T21:28:21.2027534Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-08-14T21:28:21.2083859Z inflating: build/bin/c10_tempfile_test 2025-08-14T21:28:21.2143219Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_stream 2025-08-14T21:28:21.2202682Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_catches_thread_and_block_and_device 2025-08-14T21:28:21.2261130Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_blocks_and_threads 2025-08-14T21:28:21.2319635Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_1_var_test 2025-08-14T21:28:21.2378546Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_multiple_blocks 2025-08-14T21:28:21.2437323Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_from_2_processes 2025-08-14T21:28:21.2492542Z inflating: build/bin/c10_cuda_CUDATest 2025-08-14T21:28:21.2551445Z inflating: build/bin/c10_cuda_CUDAAssertionsTest_multiple_writes_from_same_block 2025-08-14T21:28:21.3164053Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-08-14T21:28:21.3794037Z inflating: build/bin/vec_test_all_types_AVX512 2025-08-14T21:28:21.4432424Z inflating: build/bin/vec_test_all_types_AVX2 2025-08-14T21:28:21.4513571Z inflating: build/bin/Dict_test 2025-08-14T21:28:21.4572405Z inflating: build/bin/Dimname_test 2025-08-14T21:28:21.4644045Z inflating: build/bin/MaybeOwned_test 2025-08-14T21:28:21.4708082Z inflating: build/bin/NamedTensor_test 2025-08-14T21:28:21.4772916Z inflating: build/bin/apply_utils_test 2025-08-14T21:28:21.4838351Z inflating: build/bin/atest 2025-08-14T21:28:21.4909588Z inflating: build/bin/basic 2025-08-14T21:28:21.4971183Z inflating: build/bin/broadcast_test 2025-08-14T21:28:21.5027951Z inflating: build/bin/cpu_allocator_test 2025-08-14T21:28:21.5093078Z inflating: build/bin/cpu_generator_test 2025-08-14T21:28:21.5151994Z inflating: build/bin/cpu_profiling_allocator_test 2025-08-14T21:28:21.5252290Z inflating: build/bin/cpu_rng_test 2025-08-14T21:28:21.5309694Z inflating: build/bin/dlconvertor_test 2025-08-14T21:28:21.5373718Z inflating: build/bin/extension_backend_test 2025-08-14T21:28:21.5436222Z inflating: build/bin/half_test 2025-08-14T21:28:21.5539914Z inflating: build/bin/ivalue_test 2025-08-14T21:28:21.5596255Z inflating: build/bin/lazy_tensor_test 2025-08-14T21:28:21.5656384Z inflating: build/bin/math_kernel_test 2025-08-14T21:28:21.5716053Z inflating: build/bin/memory_format_test 2025-08-14T21:28:21.5776477Z inflating: build/bin/memory_overlapping_test 2025-08-14T21:28:21.5835485Z inflating: build/bin/mobile_memory_cleanup 2025-08-14T21:28:21.5898173Z inflating: build/bin/native_test 2025-08-14T21:28:21.5955041Z inflating: build/bin/operator_name_test 2025-08-14T21:28:21.6011828Z inflating: build/bin/operators_test 2025-08-14T21:28:21.6070477Z inflating: build/bin/packedtensoraccessor_test 2025-08-14T21:28:21.6144586Z inflating: build/bin/pow_test 2025-08-14T21:28:21.6208317Z inflating: build/bin/quantized_test 2025-08-14T21:28:21.6264516Z inflating: build/bin/reduce_ops_test 2025-08-14T21:28:21.6322050Z inflating: build/bin/reportMemoryUsage_test 2025-08-14T21:28:21.6384801Z inflating: build/bin/scalar_tensor_test 2025-08-14T21:28:21.6450414Z inflating: build/bin/scalar_test 2025-08-14T21:28:21.6507557Z inflating: build/bin/StorageUtils_test 2025-08-14T21:28:21.6566215Z inflating: build/bin/stride_properties_test 2025-08-14T21:28:21.6652683Z inflating: build/bin/tensor_iterator_test 2025-08-14T21:28:21.6714269Z inflating: build/bin/test_parallel 2025-08-14T21:28:21.6771184Z inflating: build/bin/thread_init_test 2025-08-14T21:28:21.6832179Z inflating: build/bin/type_ptr_test 2025-08-14T21:28:21.6898225Z inflating: build/bin/type_test 2025-08-14T21:28:21.6956767Z inflating: build/bin/undefined_tensor_test 2025-08-14T21:28:21.7012222Z inflating: build/bin/verify_api_visibility 2025-08-14T21:28:21.7089037Z inflating: build/bin/legacy_vmap_test 2025-08-14T21:28:21.7146214Z inflating: build/bin/weakref_test 2025-08-14T21:28:21.7204181Z inflating: build/bin/wrapdim_test 2025-08-14T21:28:21.7261850Z inflating: build/bin/xla_tensor_test 2025-08-14T21:28:21.7328026Z inflating: build/bin/IListRef_test 2025-08-14T21:28:21.7442634Z inflating: build/bin/List_test 2025-08-14T21:28:21.7516135Z inflating: build/bin/KernelFunction_test 2025-08-14T21:28:21.7644361Z inflating: build/bin/kernel_function_legacy_test 2025-08-14T21:28:21.7747759Z inflating: build/bin/kernel_function_test 2025-08-14T21:28:21.7882401Z inflating: build/bin/kernel_lambda_legacy_test 2025-08-14T21:28:21.7992840Z inflating: build/bin/kernel_lambda_test 2025-08-14T21:28:21.8059481Z inflating: build/bin/kernel_stackbased_test 2025-08-14T21:28:21.8162176Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-08-14T21:28:21.8219439Z inflating: build/bin/CppSignature_test 2025-08-14T21:28:21.8281013Z inflating: build/bin/backend_fallback_test 2025-08-14T21:28:21.8335855Z inflating: build/bin/op_allowlist_test 2025-08-14T21:28:21.8662887Z inflating: build/bin/op_registration_test 2025-08-14T21:28:21.8736149Z inflating: build/bin/inline_container_test 2025-08-14T21:28:21.8794305Z inflating: build/bin/cuda_allocator_test 2025-08-14T21:28:21.8853531Z inflating: build/bin/cuda_apply_test 2025-08-14T21:28:21.8919786Z inflating: build/bin/cuda_atomic_ops_test 2025-08-14T21:28:21.8982695Z inflating: build/bin/cuda_caching_host_allocator_test 2025-08-14T21:28:21.9060084Z inflating: build/bin/cuda_complex_math_test 2025-08-14T21:28:21.9126772Z inflating: build/bin/cuda_complex_test 2025-08-14T21:28:21.9196502Z inflating: build/bin/cuda_cub_test 2025-08-14T21:28:21.9252843Z inflating: build/bin/cuda_device_test 2025-08-14T21:28:21.9324731Z inflating: build/bin/cuda_distributions_test 2025-08-14T21:28:21.9379870Z inflating: build/bin/cuda_exchange_device_test 2025-08-14T21:28:21.9437665Z inflating: build/bin/cuda_dlconvertor_test 2025-08-14T21:28:21.9496475Z inflating: build/bin/cuda_reportMemoryUsage_test 2025-08-14T21:28:21.9553812Z inflating: build/bin/cuda_integer_divider_test 2025-08-14T21:28:21.9609577Z inflating: build/bin/cuda_allocatorTraceTracker_test 2025-08-14T21:28:21.9677752Z inflating: build/bin/cuda_stream_test 2025-08-14T21:28:21.9740873Z inflating: build/bin/cuda_generator_test 2025-08-14T21:28:21.9796790Z inflating: build/bin/cuda_cudnn_test 2025-08-14T21:28:21.9853223Z inflating: build/bin/cuda_half_test 2025-08-14T21:28:21.9909289Z inflating: build/bin/cuda_optional_test 2025-08-14T21:28:21.9967001Z inflating: build/bin/cuda_packedtensoraccessor_test 2025-08-14T21:28:22.0025603Z inflating: build/bin/cuda_vectorized_test 2025-08-14T21:28:22.1160141Z inflating: build/bin/test_jit 2025-08-14T21:28:22.1489676Z inflating: build/bin/test_nativert 2025-08-14T21:28:22.1548751Z inflating: build/bin/BackoffTest 2025-08-14T21:28:22.1608290Z inflating: build/bin/FileStoreTest 2025-08-14T21:28:22.1671802Z inflating: build/bin/TCPStoreTest 2025-08-14T21:28:22.1732333Z inflating: build/bin/HashStoreTest 2025-08-14T21:28:22.1746844Z inflating: build/bin/ProcessGroupMPITest 2025-08-14T21:28:22.1750518Z inflating: build/bin/example_allreduce 2025-08-14T21:28:22.1811955Z inflating: build/bin/test_dist_autograd 2025-08-14T21:28:22.1885395Z inflating: build/bin/ProcessGroupGlooTest 2025-08-14T21:28:22.1948629Z inflating: build/bin/ProcessGroupGlooAsyncTest 2025-08-14T21:28:22.2018653Z inflating: build/bin/ProcessGroupNCCLTest 2025-08-14T21:28:22.2086425Z inflating: build/bin/ProcessGroupNCCLErrorsTest 2025-08-14T21:28:22.2160824Z inflating: build/bin/test_cpp_rpc 2025-08-14T21:28:22.3351963Z inflating: build/bin/test_api 2025-08-14T21:28:22.3354793Z inflating: build/bin/parallel_benchmark 2025-08-14T21:28:22.3715786Z inflating: build/bin/test_lazy 2025-08-14T21:28:22.3720054Z inflating: build/bin/torch_shm_manager 2025-08-14T21:28:22.3720381Z creating: .additional_ci_files/ 2025-08-14T21:28:22.3802154Z inflating: .additional_ci_files/test-times.json 2025-08-14T21:28:22.4108608Z inflating: .additional_ci_files/test-class-times.json 2025-08-14T21:28:22.4143022Z ##[group]Run rm artifacts.zip 2025-08-14T21:28:22.4143329Z rm artifacts.zip 2025-08-14T21:28:22.4153644Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:22.4154026Z env: 2025-08-14T21:28:22.4154243Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:22.4154586Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:22.4154952Z ##[endgroup] 2025-08-14T21:28:22.7903623Z ##[group]Run df -H 2025-08-14T21:28:22.7903879Z df -H 2025-08-14T21:28:22.7912757Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:22.7913107Z env: 2025-08-14T21:28:22.7913357Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:22.7913692Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:22.7914026Z ##[endgroup] 2025-08-14T21:28:22.7968282Z Filesystem Size Used Avail Use% Mounted on 2025-08-14T21:28:22.7968655Z devtmpfs 4.2M 0 4.2M 0% /dev 2025-08-14T21:28:22.7969032Z tmpfs 34G 0 34G 0% /dev/shm 2025-08-14T21:28:22.7969480Z tmpfs 14G 553k 14G 1% /run 2025-08-14T21:28:22.7969904Z /dev/nvme0n1p1 161G 53G 109G 33% / 2025-08-14T21:28:22.7980842Z tmpfs 34G 13k 34G 1% /tmp 2025-08-14T21:28:22.7981421Z /dev/nvme0n1p128 11M 1.4M 9.2M 13% /boot/efi 2025-08-14T21:28:22.7981880Z tmpfs 6.7G 0 6.7G 0% /run/user/0 2025-08-14T21:28:22.8013413Z Prepare all required actions 2025-08-14T21:28:22.8013963Z Getting action download info 2025-08-14T21:28:22.9725894Z ##[group]Run ./.github/actions/download-td-artifacts 2025-08-14T21:28:22.9726261Z with: 2025-08-14T21:28:22.9726466Z env: 2025-08-14T21:28:22.9726686Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:22.9727022Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:22.9727360Z ##[endgroup] 2025-08-14T21:28:22.9780492Z ##[group]Run seemethere/download-artifact-s3@v4 2025-08-14T21:28:22.9780995Z with: 2025-08-14T21:28:22.9781316Z name: td_results 2025-08-14T21:28:22.9781683Z s3-bucket: gha-artifacts 2025-08-14T21:28:22.9782076Z region: us-east-1 2025-08-14T21:28:22.9782426Z env: 2025-08-14T21:28:22.9782746Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:22.9783228Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:22.9783760Z ##[endgroup] 2025-08-14T21:28:23.4268047Z (node:62591) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-08-14T21:28:23.4268519Z 2025-08-14T21:28:23.4268722Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-08-14T21:28:23.4269215Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-08-14T21:28:23.4269726Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-08-14T21:28:23.5429276Z Found 1 objects with prefix pytorch/pytorch/16976255024/td_results/ 2025-08-14T21:28:23.5430454Z Starting download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2025-08-14T21:28:23.6188087Z Finished download (1/1): /home/ec2-user/actions-runner/_work/pytorch/pytorch/td_results.json 2025-08-14T21:28:23.6193429Z Artifact download has finished successfully 2025-08-14T21:28:23.6535369Z ##[group]Run mkdir -p .additional_ci_files 2025-08-14T21:28:23.6535724Z mkdir -p .additional_ci_files 2025-08-14T21:28:23.6536144Z mv td_results.json .additional_ci_files/td_results.json || true 2025-08-14T21:28:23.6546456Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:23.6546820Z env: 2025-08-14T21:28:23.6547032Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:23.6547788Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:23.6548129Z ##[endgroup] 2025-08-14T21:28:23.6650343Z ##[group]Run .github/scripts/parse_ref.py 2025-08-14T21:28:23.6650704Z .github/scripts/parse_ref.py 2025-08-14T21:28:23.6658990Z shell: /usr/bin/bash -e {0} 2025-08-14T21:28:23.6659258Z env: 2025-08-14T21:28:23.6659466Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:23.6659777Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:23.6660114Z ##[endgroup] 2025-08-14T21:28:23.6905795Z Setting output branch=main 2025-08-14T21:28:23.7025085Z Prepare all required actions 2025-08-14T21:28:23.7025441Z Getting action download info 2025-08-14T21:28:23.8440713Z ##[group]Run ./.github/actions/filter-test-configs 2025-08-14T21:28:23.8441044Z with: 2025-08-14T21:28:23.8441688Z github-token: *** 2025-08-14T21:28:23.8442595Z test-matrix: {"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]} 2025-08-14T21:28:23.8443777Z job-name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:28:23.8444271Z env: 2025-08-14T21:28:23.8444488Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:23.8444888Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:23.8445231Z ##[endgroup] 2025-08-14T21:28:23.8489564Z ##[group]Run nick-fields/retry@v3.0.0 2025-08-14T21:28:23.8489881Z with: 2025-08-14T21:28:23.8490081Z shell: bash 2025-08-14T21:28:23.8490307Z timeout_minutes: 10 2025-08-14T21:28:23.8490552Z max_attempts: 5 2025-08-14T21:28:23.8490782Z retry_wait_seconds: 30 2025-08-14T21:28:23.8491504Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-08-14T21:28:23.8492431Z polling_interval_seconds: 1 2025-08-14T21:28:23.8492712Z warning_on_retry: true 2025-08-14T21:28:23.8492965Z continue_on_error: false 2025-08-14T21:28:23.8493217Z env: 2025-08-14T21:28:23.8493433Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:23.8493750Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:23.8494338Z GITHUB_TOKEN: *** 2025-08-14T21:28:23.8494572Z ##[endgroup] 2025-08-14T21:28:23.9534125Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-08-14T21:28:24.1871198Z Defaulting to user installation because normal site-packages is not writeable 2025-08-14T21:28:24.3128671Z Collecting requests==2.27.1 2025-08-14T21:28:24.3319665Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-08-14T21:28:24.5035442Z Collecting pyyaml==6.0.2 2025-08-14T21:28:24.5073150Z Downloading PyYAML-6.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (737 kB) 2025-08-14T21:28:24.5304402Z Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (1.25.10) 2025-08-14T21:28:24.5314185Z Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3.9/site-packages (from requests==2.27.1) (2.10) 2025-08-14T21:28:24.5882197Z Collecting certifi>=2017.4.17 2025-08-14T21:28:24.5915577Z Downloading certifi-2025.8.3-py3-none-any.whl (161 kB) 2025-08-14T21:28:24.9904861Z Collecting charset-normalizer~=2.0.0 2025-08-14T21:28:24.9944157Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-08-14T21:28:25.0824197Z Installing collected packages: charset-normalizer, certifi, requests, pyyaml 2025-08-14T21:28:25.2039170Z Successfully installed certifi-2025.8.3 charset-normalizer-2.0.12 pyyaml-6.0.2 requests-2.27.1 2025-08-14T21:28:25.9280542Z Command completed after 1 attempt(s). 2025-08-14T21:28:25.9383293Z ##[group]Run set -x 2025-08-14T21:28:25.9383872Z set -x 2025-08-14T21:28:25.9384230Z  2025-08-14T21:28:25.9384770Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-08-14T21:28:25.9385440Z # in runner workspace 2025-08-14T21:28:25.9386002Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-08-14T21:28:25.9401223Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:25.9401743Z env: 2025-08-14T21:28:25.9402061Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:25.9402538Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:25.9403042Z ##[endgroup] 2025-08-14T21:28:25.9440995Z + python3 /home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-08-14T21:28:25.9624466Z Setting output branch=main 2025-08-14T21:28:25.9688076Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-08-14T21:28:25.9688460Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-08-14T21:28:25.9688802Z echo "Job name: ${JOB_NAME}" 2025-08-14T21:28:25.9689075Z  2025-08-14T21:28:25.9689428Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-08-14T21:28:25.9689866Z # in runner workspace 2025-08-14T21:28:25.9690265Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-08-14T21:28:25.9690710Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-08-14T21:28:25.9691021Z  --job-name "${JOB_NAME}" \ 2025-08-14T21:28:25.9692009Z  --test-matrix "{"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]}" \ 2025-08-14T21:28:25.9693009Z  --selected-test-configs "" \ 2025-08-14T21:28:25.9693330Z  --pr-number "${PR_NUMBER}" \ 2025-08-14T21:28:25.9693803Z  --tag "${TAG}" \ 2025-08-14T21:28:25.9694084Z  --event-name "${EVENT_NAME}" \ 2025-08-14T21:28:25.9694389Z  --schedule "${SCHEDULE}" \ 2025-08-14T21:28:25.9694685Z  --branch "${HEAD_BRANCH}" 2025-08-14T21:28:25.9703645Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:25.9704000Z env: 2025-08-14T21:28:25.9704223Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:25.9704591Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:25.9705271Z GITHUB_TOKEN: *** 2025-08-14T21:28:25.9705741Z JOB_NAME: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:28:25.9706280Z PR_NUMBER: 2025-08-14T21:28:25.9706495Z TAG: 2025-08-14T21:28:25.9706704Z EVENT_NAME: push 2025-08-14T21:28:25.9706935Z SCHEDULE: 2025-08-14T21:28:25.9707150Z HEAD_BRANCH: main 2025-08-14T21:28:25.9707379Z ##[endgroup] 2025-08-14T21:28:25.9738554Z Workflow: slow 2025-08-14T21:28:25.9739030Z Job name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:28:26.1344129Z Setting output keep-going=True 2025-08-14T21:28:26.1344462Z Setting output ci-verbose-test-logs=False 2025-08-14T21:28:26.1344807Z Setting output ci-test-showlocals=False 2025-08-14T21:28:26.1345125Z Setting output ci-no-test-timeout=False 2025-08-14T21:28:26.1345439Z Setting output ci-no-td=False 2025-08-14T21:28:26.1345725Z Setting output ci-td-distributed=False 2025-08-14T21:28:26.1346040Z Setting output is-unstable=False 2025-08-14T21:28:26.1346337Z Setting output reenabled-issues= 2025-08-14T21:28:26.1347563Z Setting output test-matrix={"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]} 2025-08-14T21:28:26.1348603Z Setting output is-test-matrix-empty=False 2025-08-14T21:28:26.1479092Z ##[group]Run echo "Filtered matrix:" 2025-08-14T21:28:26.1479422Z echo "Filtered matrix:" 2025-08-14T21:28:26.1480359Z echo "{"include": [{"config": "slow", "shard": 1, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 2, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}, {"config": "slow", "shard": 3, "num_shards": 3, "runner": "linux.g5.4xlarge.nvidia.gpu"}]}" 2025-08-14T21:28:26.1481294Z  2025-08-14T21:28:26.1481504Z echo 2025-08-14T21:28:26.1481774Z echo "Is the current job unstable? False" 2025-08-14T21:28:26.1482089Z  2025-08-14T21:28:26.1482495Z echo 2025-08-14T21:28:26.1482749Z echo "Is keep-going label set? True" 2025-08-14T21:28:26.1483049Z  2025-08-14T21:28:26.1483247Z echo 2025-08-14T21:28:26.1483483Z echo "Reenabled issues? " 2025-08-14T21:28:26.1493385Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:26.1493791Z env: 2025-08-14T21:28:26.1494016Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:26.1494360Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:26.1494696Z ##[endgroup] 2025-08-14T21:28:26.1525144Z Filtered matrix: 2025-08-14T21:28:26.1526010Z {include: [{config: slow, shard: 1, num_shards: 3, runner: linux.g5.4xlarge.nvidia.gpu}, {config: slow, shard: 2, num_shards: 3, runner: linux.g5.4xlarge.nvidia.gpu}, {config: slow, shard: 3, num_shards: 3, runner: linux.g5.4xlarge.nvidia.gpu}]} 2025-08-14T21:28:26.1526804Z 2025-08-14T21:28:26.1526916Z Is the current job unstable? False 2025-08-14T21:28:26.1527110Z 2025-08-14T21:28:26.1527477Z Is keep-going label set? True 2025-08-14T21:28:26.1527669Z 2025-08-14T21:28:26.1527761Z Reenabled issues? 2025-08-14T21:28:26.1574663Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-08-14T21:28:26.1575162Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-08-14T21:28:26.1583717Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:26.1584074Z env: 2025-08-14T21:28:26.1584290Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:26.1584603Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:26.1584943Z JOB_TIMEOUT: 240 2025-08-14T21:28:26.1585168Z ##[endgroup] 2025-08-14T21:28:26.1647875Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-08-14T21:28:26.1648373Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-08-14T21:28:26.1648794Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-08-14T21:28:26.1656965Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T21:28:26.1657319Z env: 2025-08-14T21:28:26.1657545Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:26.1657861Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:26.1658202Z ##[endgroup] 2025-08-14T21:28:26.1791577Z ##[group]Run set -x 2025-08-14T21:28:26.1791909Z set -x 2025-08-14T21:28:26.1792117Z  2025-08-14T21:28:26.1792360Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-08-14T21:28:26.1792732Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-08-14T21:28:26.1793109Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-08-14T21:28:26.1793444Z  TEST_COMMAND=.ci/onnx/test.sh 2025-08-14T21:28:26.1793727Z else 2025-08-14T21:28:26.1793973Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-08-14T21:28:26.1794257Z fi 2025-08-14T21:28:26.1794455Z  2025-08-14T21:28:26.1794711Z # Leaving 1GB for the runner and other things 2025-08-14T21:28:26.1795291Z TOTAL_AVAILABLE_MEMORY_IN_GB=$(awk '/MemTotal/ { printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo) 2025-08-14T21:28:26.1796077Z # https://docs.docker.com/engine/containers/resource_constraints/#--memory-swap-details, the 3GB swap 2025-08-14T21:28:26.1796724Z # comes from https://github.com/pytorch/test-infra/pull/6058 2025-08-14T21:28:26.1797221Z TOTAL_MEMORY_WITH_SWAP=$(("${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}" + 3)) 2025-08-14T21:28:26.1797611Z  2025-08-14T21:28:26.1797870Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-08-14T21:28:26.1798190Z  SHM_OPTS= 2025-08-14T21:28:26.1798428Z  JENKINS_USER= 2025-08-14T21:28:26.1798771Z  # ensure that docker container cleanly exits in 12 hours 2025-08-14T21:28:26.1799208Z  # if for some reason cleanup action doesn't stop container 2025-08-14T21:28:26.1799577Z  # when job is cancelled 2025-08-14T21:28:26.1799870Z  DOCKER_SHELL_CMD="sleep 12h" 2025-08-14T21:28:26.1800157Z else 2025-08-14T21:28:26.1800401Z  SHM_OPTS="--shm-size=${SHM_SIZE}" 2025-08-14T21:28:26.1800728Z  JENKINS_USER="--user jenkins" 2025-08-14T21:28:26.1801029Z  DOCKER_SHELL_CMD= 2025-08-14T21:28:26.1801274Z fi 2025-08-14T21:28:26.1801477Z  2025-08-14T21:28:26.1801797Z # detached container should get cleaned up by teardown_ec2_linux 2025-08-14T21:28:26.1802283Z # TODO: Stop building test binaries as part of the build phase 2025-08-14T21:28:26.1802838Z # Used for GPU_FLAG, SHM_OPTS, JENKINS_USER and DOCKER_SHELL_CMD since that doesn't play nice 2025-08-14T21:28:26.1803332Z # shellcheck disable=SC2086,SC2090 2025-08-14T21:28:26.1803650Z container_name=$(docker run \ 2025-08-14T21:28:26.1803937Z  ${GPU_FLAG:-} \ 2025-08-14T21:28:26.1804228Z  ${SCCACHE_SERVER_PORT_DOCKER_FLAG:-} \ 2025-08-14T21:28:26.1804689Z  -e BUILD_ENVIRONMENT \ 2025-08-14T21:28:26.1804963Z  -e PR_NUMBER \ 2025-08-14T21:28:26.1805223Z  -e GITHUB_ACTIONS \ 2025-08-14T21:28:26.1805497Z  -e GITHUB_REPOSITORY \ 2025-08-14T21:28:26.1805785Z  -e GITHUB_WORKFLOW \ 2025-08-14T21:28:26.1806051Z  -e GITHUB_JOB \ 2025-08-14T21:28:26.1806492Z  -e GITHUB_RUN_ID \ 2025-08-14T21:28:26.1806810Z  -e GITHUB_RUN_NUMBER \ 2025-08-14T21:28:26.1807135Z  -e GITHUB_RUN_ATTEMPT \ 2025-08-14T21:28:26.1807411Z  -e JOB_ID \ 2025-08-14T21:28:26.1807653Z  -e JOB_NAME \ 2025-08-14T21:28:26.1807893Z  -e BASE_SHA \ 2025-08-14T21:28:26.1808136Z  -e BRANCH \ 2025-08-14T21:28:26.1808373Z  -e SHA1 \ 2025-08-14T21:28:26.1808612Z  -e AWS_DEFAULT_REGION \ 2025-08-14T21:28:26.1808894Z  -e IN_WHEEL_TEST \ 2025-08-14T21:28:26.1809156Z  -e SHARD_NUMBER \ 2025-08-14T21:28:26.1809415Z  -e TEST_CONFIG \ 2025-08-14T21:28:26.1809671Z  -e NUM_TEST_SHARDS \ 2025-08-14T21:28:26.1809953Z  -e REENABLED_ISSUES \ 2025-08-14T21:28:26.1810239Z  -e CONTINUE_THROUGH_ERROR \ 2025-08-14T21:28:26.1810662Z  -e VERBOSE_TEST_LOGS \ 2025-08-14T21:28:26.1810945Z  -e TEST_SHOWLOCALS \ 2025-08-14T21:28:26.1811226Z  -e NO_TEST_TIMEOUT \ 2025-08-14T21:28:26.1811480Z  -e NO_TD \ 2025-08-14T21:28:26.1811726Z  -e TD_DISTRIBUTED \ 2025-08-14T21:28:26.1811994Z  -e PR_LABELS \ 2025-08-14T21:28:26.1812277Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-08-14T21:28:26.1812584Z  -e SCCACHE_BUCKET \ 2025-08-14T21:28:26.1812853Z  -e SCCACHE_REGION \ 2025-08-14T21:28:26.1813113Z  -e XLA_CUDA \ 2025-08-14T21:28:26.1813382Z  -e XLA_CLANG_CACHE_S3_BUCKET_NAME \ 2025-08-14T21:28:26.1813719Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-08-14T21:28:26.1814065Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-08-14T21:28:26.1814419Z  -e SKIP_SCCACHE_INITIALIZATION=1 \ 2025-08-14T21:28:26.1814776Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-08-14T21:28:26.1815092Z  -e SCRIBE_GRAPHQL_ACCESS_TOKEN \ 2025-08-14T21:28:26.1815396Z  -e DASHBOARD_TAG \ 2025-08-14T21:28:26.1815667Z  -e ARTIFACTS_FILE_SUFFIX \ 2025-08-14T21:28:26.1816008Z  --memory="${TOTAL_AVAILABLE_MEMORY_IN_GB%.*}g" \ 2025-08-14T21:28:26.1816409Z  --memory-swap="${TOTAL_MEMORY_WITH_SWAP}g" \ 2025-08-14T21:28:26.1816791Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-08-14T21:28:26.1817170Z  --security-opt seccomp=unconfined \ 2025-08-14T21:28:26.1817492Z  --cap-add=SYS_PTRACE \ 2025-08-14T21:28:26.1817770Z  --ipc=host \ 2025-08-14T21:28:26.1818013Z  ${SHM_OPTS} \ 2025-08-14T21:28:26.1818258Z  --tty \ 2025-08-14T21:28:26.1818483Z  --detach \ 2025-08-14T21:28:26.1818732Z  --name="${container_name}" \ 2025-08-14T21:28:26.1819024Z  ${JENKINS_USER} \ 2025-08-14T21:28:26.1819349Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-08-14T21:28:26.1819722Z  -w /var/lib/jenkins/workspace \ 2025-08-14T21:28:26.1820028Z  "${DOCKER_IMAGE}" \ 2025-08-14T21:28:26.1820294Z  ${DOCKER_SHELL_CMD} 2025-08-14T21:28:26.1820540Z ) 2025-08-14T21:28:26.1820817Z # Propagate download.pytorch.org IP to container 2025-08-14T21:28:26.1821431Z grep download.pytorch.org /etc/hosts | docker exec -i "${container_name}" sudo bash -c "/bin/cat >> /etc/hosts" 2025-08-14T21:28:26.1822080Z echo "DOCKER_CONTAINER_ID=${container_name}" >> "${GITHUB_ENV}" 2025-08-14T21:28:26.1822451Z  2025-08-14T21:28:26.1822708Z if [[ ${BUILD_ENVIRONMENT} == *"s390x"* ]]; then 2025-08-14T21:28:26.1823238Z  docker exec -t "${container_name}" sh -c "python3 -m pip install -r .ci/docker/requirements-ci.txt" 2025-08-14T21:28:26.1823712Z fi 2025-08-14T21:28:26.1823910Z  2025-08-14T21:28:26.1824366Z docker exec -t "${container_name}" sh -c "python3 -m pip install $(echo dist/*.whl)[opt-einsum] && ${TEST_COMMAND}" 2025-08-14T21:28:26.1833562Z shell: /usr/bin/bash -e {0} 2025-08-14T21:28:26.1833807Z env: 2025-08-14T21:28:26.1834018Z GIT_DEFAULT_BRANCH: main 2025-08-14T21:28:26.1834331Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:26.1834750Z BUILD_ENVIRONMENT: linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-08-14T21:28:26.1835103Z PR_NUMBER: 2025-08-14T21:28:26.1835339Z GITHUB_REPOSITORY: pytorch/pytorch 2025-08-14T21:28:26.1835627Z GITHUB_WORKFLOW: slow 2025-08-14T21:28:26.1835859Z GITHUB_JOB: test 2025-08-14T21:28:26.1836086Z GITHUB_RUN_ID: 16976255024 2025-08-14T21:28:26.1836345Z GITHUB_RUN_NUMBER: 16376 2025-08-14T21:28:26.1836595Z GITHUB_RUN_ATTEMPT: 1 2025-08-14T21:28:26.1836826Z JOB_ID: 48128162047 2025-08-14T21:28:26.1837274Z JOB_NAME: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:28:26.1837752Z BRANCH: main 2025-08-14T21:28:26.1838119Z SHA1: 1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:26.1838474Z BASE_SHA: 1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:26.1838787Z TEST_CONFIG: slow 2025-08-14T21:28:26.1839009Z SHARD_NUMBER: 3 2025-08-14T21:28:26.1839237Z NUM_TEST_SHARDS: 3 2025-08-14T21:28:26.1839469Z REENABLED_ISSUES: 2025-08-14T21:28:26.1839704Z CONTINUE_THROUGH_ERROR: True 2025-08-14T21:28:26.1839976Z VERBOSE_TEST_LOGS: False 2025-08-14T21:28:26.1840232Z TEST_SHOWLOCALS: False 2025-08-14T21:28:26.1840474Z NO_TEST_TIMEOUT: False 2025-08-14T21:28:26.1840731Z NO_TD: False 2025-08-14T21:28:26.1840948Z TD_DISTRIBUTED: False 2025-08-14T21:28:26.1841255Z SCCACHE_BUCKET: ossci-compiler-cache-circleci-v2 2025-08-14T21:28:26.1841599Z SCCACHE_REGION: us-east-1 2025-08-14T21:28:26.1841849Z SHM_SIZE: 2g 2025-08-14T21:28:26.1842557Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:28:26.1843321Z XLA_CUDA: 2025-08-14T21:28:26.1843656Z XLA_CLANG_CACHE_S3_BUCKET_NAME: ossci-compiler-clang-cache-circleci-xla 2025-08-14T21:28:26.1844075Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 0 2025-08-14T21:28:26.1844522Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-08-14T21:28:26.1844806Z DASHBOARD_TAG: 2025-08-14T21:28:26.1845194Z HUGGING_FACE_HUB_TOKEN: *** 2025-08-14T21:28:26.1845625Z SCRIBE_GRAPHQL_ACCESS_TOKEN: *** 2025-08-14T21:28:26.1846095Z ARTIFACTS_FILE_SUFFIX: test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047 2025-08-14T21:28:26.1846588Z ##[endgroup] 2025-08-14T21:28:26.1876897Z + [[ slow == \m\u\l\t\i\g\p\u ]] 2025-08-14T21:28:26.1877390Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *onnx* ]] 2025-08-14T21:28:26.1877880Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-08-14T21:28:26.1881700Z ++ awk '/MemTotal/ { printf "%.3f \n", $2/1024/1024 - 1 }' /proc/meminfo 2025-08-14T21:28:26.1908332Z + TOTAL_AVAILABLE_MEMORY_IN_GB='61.094 ' 2025-08-14T21:28:26.1908648Z + TOTAL_MEMORY_WITH_SWAP=64 2025-08-14T21:28:26.1909007Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *\s\3\9\0\x* ]] 2025-08-14T21:28:26.1909396Z + SHM_OPTS=--shm-size=2g 2025-08-14T21:28:26.1909657Z + JENKINS_USER='--user jenkins' 2025-08-14T21:28:26.1909913Z + DOCKER_SHELL_CMD= 2025-08-14T21:28:26.1918864Z +++ nproc --ignore=2 2025-08-14T21:28:26.1950548Z ++ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e TD_DISTRIBUTED -e PR_LABELS -e MAX_JOBS=14 -e SCCACHE_BUCKET -e SCCACHE_REGION -e XLA_CUDA -e XLA_CLANG_CACHE_S3_BUCKET_NAME -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e SKIP_SCCACHE_INITIALIZATION=1 -e HUGGING_FACE_HUB_TOKEN -e SCRIBE_GRAPHQL_ACCESS_TOKEN -e DASHBOARD_TAG -e ARTIFACTS_FILE_SUFFIX --memory=61g --memory-swap=64g --env-file=/tmp/github_env_16976255024 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --ipc=host --shm-size=2g --tty --detach --name= --user jenkins -v /home/ec2-user/actions-runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T21:28:35.3273222Z + container_name=04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T21:28:35.3278084Z + grep download.pytorch.org /etc/hosts 2025-08-14T21:28:35.3279287Z + docker exec -i 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c sudo bash -c '/bin/cat >> /etc/hosts' 2025-08-14T21:28:35.4363882Z + echo DOCKER_CONTAINER_ID=04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T21:28:35.4366460Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *\s\3\9\0\x* ]] 2025-08-14T21:28:35.4371144Z ++ echo dist/torch-2.9.0a0+git1fc683c-cp310-cp310-linux_x86_64.whl 2025-08-14T21:28:35.4374673Z + docker exec -t 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c sh -c 'python3 -m pip install dist/torch-2.9.0a0+git1fc683c-cp310-cp310-linux_x86_64.whl[opt-einsum] && .ci/pytorch/test.sh' 2025-08-14T21:28:35.8353524Z Processing ./dist/torch-2.9.0a0+git1fc683c-cp310-cp310-linux_x86_64.whl (from torch==2.9.0a0+git1fc683c) 2025-08-14T21:28:36.1484289Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (3.18.0) 2025-08-14T21:28:36.1487233Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (4.14.1) 2025-08-14T21:28:36.1492117Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (1.13.3) 2025-08-14T21:28:36.1496761Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (2.8.8) 2025-08-14T21:28:36.1500317Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (3.1.6) 2025-08-14T21:28:36.1504746Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (2025.5.1) 2025-08-14T21:28:36.1518097Z Requirement already satisfied: opt-einsum>=3.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (3.3.0) 2025-08-14T21:28:36.1898260Z Requirement already satisfied: numpy>=1.7 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from opt-einsum>=3.3->torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (1.22.4) 2025-08-14T21:28:36.1916674Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (1.3.0) 2025-08-14T21:28:36.1973391Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.9.0a0+git1fc683c->torch==2.9.0a0+git1fc683c) (3.0.2) 2025-08-14T21:28:36.5796732Z Installing collected packages: torch 2025-08-14T21:28:47.6214183Z Successfully installed torch-2.9.0a0+git1fc683c 2025-08-14T21:28:47.6872356Z + export TERM=vt100 2025-08-14T21:28:47.6874662Z + TERM=vt100 2025-08-14T21:28:47.6874927Z ++ dirname .ci/pytorch/test.sh 2025-08-14T21:28:47.6885296Z + source .ci/pytorch/common.sh 2025-08-14T21:28:47.6889212Z +++ dirname .ci/pytorch/common.sh 2025-08-14T21:28:47.6899337Z ++ source .ci/pytorch/common_utils.sh 2025-08-14T21:28:47.6900870Z +++ declare -f -t trap_add 2025-08-14T21:28:47.6905561Z ++ set -ex -o pipefail 2025-08-14T21:28:47.6905968Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *rocm* ]] 2025-08-14T21:28:47.6906311Z ++ BUILD_TEST_LIBTORCH=0 2025-08-14T21:28:47.6910530Z ++ dirname .ci/pytorch/test.sh 2025-08-14T21:28:47.6919942Z + source .ci/pytorch/common-build.sh 2025-08-14T21:28:47.6921833Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *win-* ]] 2025-08-14T21:28:47.6929374Z ++++ dirname .ci/pytorch/common-build.sh 2025-08-14T21:28:47.6938862Z +++ cd .ci/pytorch 2025-08-14T21:28:47.6939223Z +++ pwd -P 2025-08-14T21:28:47.6942433Z ++ script_dir=/var/lib/jenkins/workspace/.ci/pytorch 2025-08-14T21:28:47.6942863Z ++ [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *-pch* ]] 2025-08-14T21:28:47.6943209Z ++ which sccache 2025-08-14T21:28:47.7031671Z ++ [[ -z ossci-compiler-cache-circleci-v2 ]] 2025-08-14T21:28:47.7032062Z ++ sccache --stop-server 2025-08-14T21:28:47.7069299Z ++ true 2025-08-14T21:28:47.7070207Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-08-14T21:28:47.7081961Z ++ trap_add sccache_epilogue EXIT 2025-08-14T21:28:47.7082281Z ++ trap_add_cmd=sccache_epilogue 2025-08-14T21:28:47.7082548Z ++ shift 2025-08-14T21:28:47.7082758Z ++ for trap_add_name in "$@" 2025-08-14T21:28:47.7090447Z ++++ trap -p EXIT 2025-08-14T21:28:47.7093195Z +++ eval 'extract_trap_cmd ' 2025-08-14T21:28:47.7093500Z ++++ extract_trap_cmd 2025-08-14T21:28:47.7093737Z ++++ printf '%s\n' '' 2025-08-14T21:28:47.7093993Z +++ printf '%s\n' sccache_epilogue 2025-08-14T21:28:47.7096850Z ++ trap -- ' 2025-08-14T21:28:47.7097151Z sccache_epilogue' EXIT 2025-08-14T21:28:47.7097386Z ++ [[ -n 1 ]] 2025-08-14T21:28:47.7097748Z ++ echo 'Skipping sccache server initialization, setting environment variables' 2025-08-14T21:28:47.7098288Z Skipping sccache server initialization, setting environment variables 2025-08-14T21:28:47.7098705Z ++ export SCCACHE_IDLE_TIMEOUT=0 2025-08-14T21:28:47.7098989Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-08-14T21:28:47.7099325Z ++ export SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-08-14T21:28:47.7099754Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-08-14T21:28:47.7100136Z ++ export RUST_LOG=sccache::server=error 2025-08-14T21:28:47.7100438Z ++ RUST_LOG=sccache::server=error 2025-08-14T21:28:47.7100715Z ++ sccache --zero-stats 2025-08-14T21:28:48.0035014Z Statistics zeroed. 2025-08-14T21:28:48.0043486Z ++ which ccache 2025-08-14T21:28:48.0188089Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *rocm* ]] 2025-08-14T21:28:48.0189103Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *s390x* ]] 2025-08-14T21:28:48.0189824Z + [[ -d /var/lib/jenkins/workspace ]] 2025-08-14T21:28:48.0193232Z ++ stat -c %u /var/lib/jenkins/workspace 2025-08-14T21:28:48.0212720Z + WORKSPACE_ORIGINAL_OWNER_ID=1000 2025-08-14T21:28:48.0213079Z + trap_add cleanup_workspace EXIT 2025-08-14T21:28:48.0213377Z + trap_add_cmd=cleanup_workspace 2025-08-14T21:28:48.0213635Z + shift 2025-08-14T21:28:48.0213875Z + for trap_add_name in "$@" 2025-08-14T21:28:48.0221346Z +++ trap -p EXIT 2025-08-14T21:28:48.0224968Z ++ eval 'extract_trap_cmd trap -- '\'' 2025-08-14T21:28:48.0225379Z sccache_epilogue'\'' EXIT' 2025-08-14T21:28:48.0225650Z +++ extract_trap_cmd trap -- ' 2025-08-14T21:28:48.0225925Z sccache_epilogue' EXIT 2025-08-14T21:28:48.0226159Z +++ printf '%s\n' ' 2025-08-14T21:28:48.0226383Z sccache_epilogue' 2025-08-14T21:28:48.0226629Z ++ printf '%s\n' cleanup_workspace 2025-08-14T21:28:48.0229181Z + trap -- ' 2025-08-14T21:28:48.0229394Z sccache_epilogue 2025-08-14T21:28:48.0229641Z cleanup_workspace' EXIT 2025-08-14T21:28:48.0229939Z + sudo chown -R jenkins /var/lib/jenkins/workspace 2025-08-14T21:28:49.0216779Z + git config --global --add safe.directory /var/lib/jenkins/workspace 2025-08-14T21:28:49.0241297Z + echo 'Environment variables:' 2025-08-14T21:28:49.0241658Z Environment variables: 2025-08-14T21:28:49.0241975Z + env 2025-08-14T21:28:49.0254371Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-08-14T21:28:49.0254844Z CONTINUE_THROUGH_ERROR=True 2025-08-14T21:28:49.0256299Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-08-14T21:28:49.0256664Z HOSTNAME=04bb4a0f273f 2025-08-14T21:28:49.0257219Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.0269405Z GITHUB_ACTION=__run_2 2025-08-14T21:28:49.0269694Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-08-14T21:28:49.0269991Z GITHUB_RUN_NUMBER=16376 2025-08-14T21:28:49.0270236Z TEST_CONFIG=slow 2025-08-14T21:28:49.0270477Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-08-14T21:28:49.0270801Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-08-14T21:28:49.0271103Z SCCACHE_IDLE_TIMEOUT=0 2025-08-14T21:28:49.0271564Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-08-14T21:28:49.0271857Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-08-14T21:28:49.0272161Z GITHUB_REF_TYPE=branch 2025-08-14T21:28:49.0272411Z TORCH_CUDA_ARCH_LIST=Maxwell 2025-08-14T21:28:49.0272915Z BASE_SHA=1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:49.0273232Z XLA_CUDA= 2025-08-14T21:28:49.0273471Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 2025-08-14T21:28:49.0273856Z HUGGING_FACE_HUB_TOKEN=*** 2025-08-14T21:28:49.0277723Z *** 2025-08-14T21:28:49.0277947Z GITHUB_REPOSITORY_ID=65600975 2025-08-14T21:28:49.0278219Z GITHUB_ACTIONS=true 2025-08-14T21:28:49.0278451Z NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:49.0278783Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-08-14T21:28:49.0279151Z SHA1=1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:49.0279494Z GITHUB_SHA=1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:49.0279966Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/slow.yml@refs/heads/main 2025-08-14T21:28:49.0280395Z UCC_HOME=/usr 2025-08-14T21:28:49.0280608Z VERBOSE_TEST_LOGS=False 2025-08-14T21:28:49.0280862Z GITHUB_REF=refs/heads/main 2025-08-14T21:28:49.0281240Z SHARD_NUMBER=3 2025-08-14T21:28:49.0281461Z GITHUB_REF_PROTECTED=true 2025-08-14T21:28:49.0281726Z HOME=/var/lib/jenkins 2025-08-14T21:28:49.0281996Z GITHUB_API_URL=https://api.github.com 2025-08-14T21:28:49.0282307Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-08-14T21:28:49.0282636Z UCX_COMMIT=7bb2722ff2187a0cad557ae4a6afa090569f83fb 2025-08-14T21:28:49.0282956Z USE_SYSTEM_NCCL=1 2025-08-14T21:28:49.0283174Z NUM_TEST_SHARDS=3 2025-08-14T21:28:49.0283383Z UCX_HOME=/usr 2025-08-14T21:28:49.0283905Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.0284833Z JOB_NAME=linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:28:49.0285597Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.0286363Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2025-08-14T21:28:49.0286820Z GITHUB_EVENT_NAME=push 2025-08-14T21:28:49.0287058Z DASHBOARD_TAG= 2025-08-14T21:28:49.0287282Z GITHUB_RUN_ID=16976255024 2025-08-14T21:28:49.0287530Z INSTALLED_OPENBLAS= 2025-08-14T21:28:49.0288097Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.0288712Z GITHUB_ACTOR=pytorchmergebot 2025-08-14T21:28:49.0288972Z PR_NUMBER= 2025-08-14T21:28:49.0289181Z DESIRED_CUDA=12.8.1 2025-08-14T21:28:49.0289404Z GITHUB_RUN_ATTEMPT=1 2025-08-14T21:28:49.0289649Z ANACONDA_PYTHON_VERSION=3.10 2025-08-14T21:28:49.0289970Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-08-14T21:28:49.0290298Z TERM=vt100 2025-08-14T21:28:49.0290499Z INSTALLED_VISION=yes 2025-08-14T21:28:49.0290730Z BRANCH=main 2025-08-14T21:28:49.0290949Z SCCACHE_REGION=us-east-1 2025-08-14T21:28:49.0291203Z OPENSSL_ROOT_DIR=/opt/openssl 2025-08-14T21:28:49.0291474Z CUDA_PATH=/usr/local/cuda 2025-08-14T21:28:49.0291955Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-08-14T21:28:49.0292482Z GITHUB_SERVER_URL=https://github.com 2025-08-14T21:28:49.0292943Z UCC_COMMIT=20eae37090a4ce1b32bcce6144ccad0b49943e0b 2025-08-14T21:28:49.0293266Z REENABLED_ISSUES= 2025-08-14T21:28:49.0293474Z DOCS= 2025-08-14T21:28:49.0293659Z SHLVL=1 2025-08-14T21:28:49.0293849Z MAX_JOBS=14 2025-08-14T21:28:49.0294152Z GITHUB_ACTOR_ID=97764156 2025-08-14T21:28:49.0294483Z GITHUB_WORKFLOW_SHA=1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:49.0294833Z GITHUB_REF_NAME=main 2025-08-14T21:28:49.0295193Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-08-14T21:28:49.0295585Z GITHUB_JOB=test 2025-08-14T21:28:49.0295805Z NO_TEST_TIMEOUT=False 2025-08-14T21:28:49.0296040Z TD_DISTRIBUTED=False 2025-08-14T21:28:49.0296288Z GITHUB_REPOSITORY=pytorch/pytorch 2025-08-14T21:28:49.0296569Z GITHUB_RETENTION_DAYS=90 2025-08-14T21:28:49.0296820Z OPENSSL_DIR=/opt/openssl 2025-08-14T21:28:49.0297070Z GITHUB_ACTION_REPOSITORY= 2025-08-14T21:28:49.0297869Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-08-14T21:28:49.0298593Z GITHUB_BASE_REF= 2025-08-14T21:28:49.0298805Z INSTALLED_ACL= 2025-08-14T21:28:49.0299173Z ARTIFACTS_FILE_SUFFIX=test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047 2025-08-14T21:28:49.0299587Z CI=true 2025-08-14T21:28:49.0299794Z GITHUB_REPOSITORY_OWNER=pytorch 2025-08-14T21:28:49.0300108Z RUST_LOG=sccache::server=error 2025-08-14T21:28:49.0300366Z JOB_ID=48128162047 2025-08-14T21:28:49.0300586Z GITHUB_HEAD_REF= 2025-08-14T21:28:49.0300804Z GITHUB_ACTION_REF= 2025-08-14T21:28:49.0301079Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-08-14T21:28:49.0301413Z TEST_SHOWLOCALS=False 2025-08-14T21:28:49.0301639Z GITHUB_WORKFLOW=slow 2025-08-14T21:28:49.0301882Z DEBIAN_FRONTEND=noninteractive 2025-08-14T21:28:49.0302461Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.0303032Z NO_TD=False 2025-08-14T21:28:49.0303256Z SKIP_SCCACHE_INITIALIZATION=1 2025-08-14T21:28:49.0303555Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-08-14T21:28:49.0303839Z _=/usr/bin/env 2025-08-14T21:28:49.0304146Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-08-14T21:28:49.0927648Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-08-14T21:28:49.0928801Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-08-14T21:28:49.0929865Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-08-14T21:28:49.0930955Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-08-14T21:28:49.0931783Z + BUILD_DIR=build 2025-08-14T21:28:49.0932061Z + BUILD_RENAMED_DIR=build_renamed 2025-08-14T21:28:49.0932373Z + BUILD_BIN_DIR=build/bin 2025-08-14T21:28:49.0932623Z + SHARD_NUMBER=3 2025-08-14T21:28:49.0932839Z + NUM_TEST_SHARDS=3 2025-08-14T21:28:49.0933089Z + export TORCH_SERIALIZATION_DEBUG=1 2025-08-14T21:28:49.0933392Z + TORCH_SERIALIZATION_DEBUG=1 2025-08-14T21:28:49.0933661Z + export VALGRIND=ON 2025-08-14T21:28:49.0933886Z + VALGRIND=ON 2025-08-14T21:28:49.0934179Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *clang9* ]] 2025-08-14T21:28:49.0934588Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *xpu* ]] 2025-08-14T21:28:49.0934980Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *s390x* ]] 2025-08-14T21:28:49.0935315Z + [[ 0 == \1 ]] 2025-08-14T21:28:49.0935526Z + [[ True == \1 ]] 2025-08-14T21:28:49.0935807Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *bazel* ]] 2025-08-14T21:28:49.0936165Z ++ realpath build/custom_test_artifacts 2025-08-14T21:28:49.0947535Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/workspace/build/custom_test_artifacts 2025-08-14T21:28:49.0947993Z + [[ -n '' ]] 2025-08-14T21:28:49.0948223Z + echo 'Environment variables' 2025-08-14T21:28:49.0948494Z Environment variables 2025-08-14T21:28:49.0948721Z + env 2025-08-14T21:28:49.1234440Z GITHUB_WORKSPACE=/home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-08-14T21:28:49.1235370Z CONTINUE_THROUGH_ERROR=True 2025-08-14T21:28:49.1235847Z BUILD_ENVIRONMENT=linux-jammy-cuda12.8-py3.10-gcc11-sm86 2025-08-14T21:28:49.1236362Z HOSTNAME=04bb4a0f273f 2025-08-14T21:28:49.1236921Z GITHUB_PATH=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/add_path_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.1237504Z GITHUB_ACTION=__run_2 2025-08-14T21:28:49.1237859Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=0 2025-08-14T21:28:49.1238174Z GITHUB_RUN_NUMBER=16376 2025-08-14T21:28:49.1238407Z TEST_CONFIG=slow 2025-08-14T21:28:49.1238648Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-08-14T21:28:49.1238959Z TORCH_NVCC_FLAGS=-Xfatbin -compress-all 2025-08-14T21:28:49.1239248Z SCCACHE_IDLE_TIMEOUT=0 2025-08-14T21:28:49.1239673Z SCRIBE_GRAPHQL_ACCESS_TOKEN=*** 2025-08-14T21:28:49.1239967Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-08-14T21:28:49.1240258Z GITHUB_REF_TYPE=branch 2025-08-14T21:28:49.1240780Z TORCH_CUDA_ARCH_LIST=Maxwell 2025-08-14T21:28:49.1241109Z BASE_SHA=1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:49.1241425Z XLA_CUDA= 2025-08-14T21:28:49.1241645Z NCCL_LIB_DIR=/usr/local/cuda/lib64/ 2025-08-14T21:28:49.1242036Z HUGGING_FACE_HUB_TOKEN=*** 2025-08-14T21:28:49.1242330Z *** 2025-08-14T21:28:49.1242530Z GITHUB_REPOSITORY_ID=65600975 2025-08-14T21:28:49.1242796Z GITHUB_ACTIONS=true 2025-08-14T21:28:49.1243037Z NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T21:28:49.1243356Z SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-08-14T21:28:49.1243718Z SHA1=1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:49.1244060Z GITHUB_SHA=1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:49.1244647Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/slow.yml@refs/heads/main 2025-08-14T21:28:49.1245068Z UCC_HOME=/usr 2025-08-14T21:28:49.1245300Z TORCH_SERIALIZATION_DEBUG=1 2025-08-14T21:28:49.1245566Z VERBOSE_TEST_LOGS=False 2025-08-14T21:28:49.1245816Z GITHUB_REF=refs/heads/main 2025-08-14T21:28:49.1246064Z SHARD_NUMBER=3 2025-08-14T21:28:49.1246298Z GITHUB_REF_PROTECTED=true 2025-08-14T21:28:49.1246545Z HOME=/var/lib/jenkins 2025-08-14T21:28:49.1246907Z GITHUB_API_URL=https://api.github.com 2025-08-14T21:28:49.1247454Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-08-14T21:28:49.1247962Z UCX_COMMIT=7bb2722ff2187a0cad557ae4a6afa090569f83fb 2025-08-14T21:28:49.1248368Z USE_SYSTEM_NCCL=1 2025-08-14T21:28:49.1248589Z NUM_TEST_SHARDS=3 2025-08-14T21:28:49.1248874Z UCX_HOME=/usr 2025-08-14T21:28:49.1249481Z GITHUB_STATE=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/save_state_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.1250386Z JOB_NAME=linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T21:28:49.1251189Z GITHUB_ENV=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_env_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.1251929Z GITHUB_EVENT_PATH=/home/ec2-user/actions-runner/_work/_temp/_github_workflow/event.json 2025-08-14T21:28:49.1252386Z GITHUB_EVENT_NAME=push 2025-08-14T21:28:49.1252633Z DASHBOARD_TAG= 2025-08-14T21:28:49.1252860Z GITHUB_RUN_ID=16976255024 2025-08-14T21:28:49.1253108Z INSTALLED_OPENBLAS= 2025-08-14T21:28:49.1253681Z GITHUB_STEP_SUMMARY=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/step_summary_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.1254307Z GITHUB_ACTOR=pytorchmergebot 2025-08-14T21:28:49.1254564Z PR_NUMBER= 2025-08-14T21:28:49.1254764Z DESIRED_CUDA=12.8.1 2025-08-14T21:28:49.1254996Z GITHUB_RUN_ATTEMPT=1 2025-08-14T21:28:49.1255226Z VALGRIND=ON 2025-08-14T21:28:49.1255445Z ANACONDA_PYTHON_VERSION=3.10 2025-08-14T21:28:49.1255769Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-08-14T21:28:49.1256094Z TERM=vt100 2025-08-14T21:28:49.1256297Z INSTALLED_VISION=yes 2025-08-14T21:28:49.1256531Z BRANCH=main 2025-08-14T21:28:49.1256746Z SCCACHE_REGION=us-east-1 2025-08-14T21:28:49.1257004Z OPENSSL_ROOT_DIR=/opt/openssl 2025-08-14T21:28:49.1257280Z CUDA_PATH=/usr/local/cuda 2025-08-14T21:28:49.1257977Z GITHUB_ACTION_PATH=/home/ec2-user/actions-runner/_work/pytorch/pytorch/./.github/actions/setup-linux 2025-08-14T21:28:49.1258553Z GITHUB_SERVER_URL=https://github.com 2025-08-14T21:28:49.1258902Z UCC_COMMIT=20eae37090a4ce1b32bcce6144ccad0b49943e0b 2025-08-14T21:28:49.1259231Z REENABLED_ISSUES= 2025-08-14T21:28:49.1259443Z DOCS= 2025-08-14T21:28:49.1259639Z SHLVL=1 2025-08-14T21:28:49.1259835Z MAX_JOBS=14 2025-08-14T21:28:49.1260046Z GITHUB_ACTOR_ID=97764156 2025-08-14T21:28:49.1260379Z GITHUB_WORKFLOW_SHA=1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T21:28:49.1260738Z GITHUB_REF_NAME=main 2025-08-14T21:28:49.1261103Z XLA_CLANG_CACHE_S3_BUCKET_NAME=ossci-compiler-clang-cache-circleci-xla 2025-08-14T21:28:49.1261501Z GITHUB_JOB=test 2025-08-14T21:28:49.1261727Z NO_TEST_TIMEOUT=False 2025-08-14T21:28:49.1261970Z TD_DISTRIBUTED=False 2025-08-14T21:28:49.1262225Z GITHUB_REPOSITORY=pytorch/pytorch 2025-08-14T21:28:49.1262659Z GITHUB_RETENTION_DAYS=90 2025-08-14T21:28:49.1262924Z OPENSSL_DIR=/opt/openssl 2025-08-14T21:28:49.1263192Z GITHUB_ACTION_REPOSITORY= 2025-08-14T21:28:49.1263909Z PATH=/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-08-14T21:28:49.1264624Z GITHUB_BASE_REF= 2025-08-14T21:28:49.1264854Z INSTALLED_ACL= 2025-08-14T21:28:49.1265228Z ARTIFACTS_FILE_SUFFIX=test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047 2025-08-14T21:28:49.1265645Z CI=true 2025-08-14T21:28:49.1265864Z GITHUB_REPOSITORY_OWNER=pytorch 2025-08-14T21:28:49.1266177Z RUST_LOG=sccache::server=error 2025-08-14T21:28:49.1266449Z JOB_ID=48128162047 2025-08-14T21:28:49.1266671Z GITHUB_HEAD_REF= 2025-08-14T21:28:49.1266899Z GITHUB_ACTION_REF= 2025-08-14T21:28:49.1267189Z SCCACHE_BUCKET=ossci-compiler-cache-circleci-v2 2025-08-14T21:28:49.1267521Z TEST_SHOWLOCALS=False 2025-08-14T21:28:49.1267814Z GITHUB_WORKFLOW=slow 2025-08-14T21:28:49.1268135Z DEBIAN_FRONTEND=noninteractive 2025-08-14T21:28:49.1268712Z GITHUB_OUTPUT=/home/ec2-user/actions-runner/_work/_temp/_runner_file_commands/set_output_8e582d8b-2fc9-486f-9b1c-92ad493011ce 2025-08-14T21:28:49.1269301Z NO_TD=False 2025-08-14T21:28:49.1269535Z SKIP_SCCACHE_INITIALIZATION=1 2025-08-14T21:28:49.1269831Z NCCL_INCLUDE_DIR=/usr/local/cuda/include/ 2025-08-14T21:28:49.1270125Z _=/usr/bin/env 2025-08-14T21:28:49.1270351Z + echo 'Testing pytorch' 2025-08-14T21:28:49.1270597Z Testing pytorch 2025-08-14T21:28:49.1270839Z + export LANG=C.UTF-8 2025-08-14T21:28:49.1271072Z + LANG=C.UTF-8 2025-08-14T21:28:49.1271278Z + PR_NUMBER= 2025-08-14T21:28:49.1271499Z + [[ slow == \d\e\f\a\u\l\t ]] 2025-08-14T21:28:49.1271774Z + [[ slow == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-08-14T21:28:49.1272058Z + [[ slow == \s\l\o\w ]] 2025-08-14T21:28:49.1272311Z + export PYTORCH_TEST_WITH_SLOW=1 2025-08-14T21:28:49.1272596Z + PYTORCH_TEST_WITH_SLOW=1 2025-08-14T21:28:49.1272865Z + export PYTORCH_TEST_SKIP_FAST=1 2025-08-14T21:28:49.1273148Z + PYTORCH_TEST_SKIP_FAST=1 2025-08-14T21:28:49.1273522Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *slow-gradcheck* ]] 2025-08-14T21:28:49.1273972Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *cuda* ]] 2025-08-14T21:28:49.1274343Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-08-14T21:28:49.1274672Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-08-14T21:28:49.1274964Z + [[ slow == *crossref* ]] 2025-08-14T21:28:49.1275276Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *rocm* ]] 2025-08-14T21:28:49.1275664Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *xpu* ]] 2025-08-14T21:28:49.1276072Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *-bazel-* ]] 2025-08-14T21:28:49.1276435Z + pip_install ninja==1.10.2 2025-08-14T21:28:49.1276780Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-08-14T21:28:49.1277217Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-08-14T21:28:49.7634225Z Collecting ninja==1.10.2 2025-08-14T21:28:49.7791772Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-08-14T21:28:49.8171805Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-08-14T21:28:50.2246376Z Installing collected packages: ninja 2025-08-14T21:28:50.2246821Z Attempting uninstall: ninja 2025-08-14T21:28:50.2255320Z Found existing installation: ninja 1.11.1.3 2025-08-14T21:28:50.2283230Z Uninstalling ninja-1.11.1.3: 2025-08-14T21:28:50.2471490Z Successfully uninstalled ninja-1.11.1.3 2025-08-14T21:28:50.3232922Z Successfully installed ninja-1.10.2 2025-08-14T21:28:50.3834656Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-08-14T21:28:50.3836423Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-08-14T21:28:50.3837329Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *aarch64* ]] 2025-08-14T21:28:50.3837771Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *asan* ]] 2025-08-14T21:28:50.3838183Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *-debug* ]] 2025-08-14T21:28:50.3838607Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *-bazel-* ]] 2025-08-14T21:28:50.3839361Z + echo 'We are not in debug mode: linux-jammy-cuda12.8-py3.10-gcc11-sm86. Expect the assertion to pass' 2025-08-14T21:28:50.3840285Z We are not in debug mode: linux-jammy-cuda12.8-py3.10-gcc11-sm86. Expect the assertion to pass 2025-08-14T21:28:50.3840782Z + cd test 2025-08-14T21:28:50.3841112Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-08-14T21:28:52.0473442Z + [[ slow == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-08-14T21:28:52.0473809Z + [[ slow == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-08-14T21:28:52.0474143Z + [[ slow == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-08-14T21:28:52.0477345Z + DYNAMO_BENCHMARK_FLAGS=() 2025-08-14T21:28:52.0477669Z + [[ slow == *pr_time_benchmarks* ]] 2025-08-14T21:28:52.0477976Z + [[ slow == *dynamo_eager* ]] 2025-08-14T21:28:52.0478242Z + [[ slow == *aot_eager* ]] 2025-08-14T21:28:52.0478492Z + [[ slow == *aot_inductor* ]] 2025-08-14T21:28:52.0478772Z + [[ slow == *max_autotune_inductor* ]] 2025-08-14T21:28:52.0479063Z + [[ slow == *inductor* ]] 2025-08-14T21:28:52.0479306Z + [[ slow == *dynamic* ]] 2025-08-14T21:28:52.0479547Z + [[ slow == *cpu* ]] 2025-08-14T21:28:52.0479817Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-08-14T21:28:52.0509766Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *libtorch* ]] 2025-08-14T21:28:52.0510198Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *-bazel-* ]] 2025-08-14T21:28:52.0513308Z + cd test 2025-08-14T21:28:52.0513709Z + python -c 'import torch; print(torch.__config__.show())' 2025-08-14T21:28:53.6904911Z PyTorch built with: 2025-08-14T21:28:53.6905214Z - GCC 11.4 2025-08-14T21:28:53.6905479Z - C++ Version: 201703 2025-08-14T21:28:53.6905998Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-08-14T21:28:53.6906676Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-08-14T21:28:53.6907101Z - OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-08-14T21:28:53.6907421Z - LAPACK is enabled (usually provided by MKL) 2025-08-14T21:28:53.6907743Z - NNPACK is enabled 2025-08-14T21:28:53.6908003Z - CPU capability usage: AVX2 2025-08-14T21:28:53.6908266Z - CUDA Runtime 12.8 2025-08-14T21:28:53.6908602Z - NVCC architecture flags: -gencode;arch=compute_86,code=sm_86 2025-08-14T21:28:53.6908971Z - CuDNN 90.8 2025-08-14T21:28:53.6913130Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=1fc683cf17c8c673044538d10266c00f92987be2, CUDA_VERSION=12.8, CUDNN_VERSION=9.8.0, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Werror -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=ON, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-08-14T21:28:53.6917901Z 2025-08-14T21:28:54.1289140Z + cd test 2025-08-14T21:28:54.1289733Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-08-14T21:28:55.4530741Z ATen/Parallel: 2025-08-14T21:28:55.4531293Z at::get_num_threads() : 8 2025-08-14T21:28:55.4531840Z at::get_num_interop_threads() : 16 2025-08-14T21:28:55.4532431Z OpenMP 201511 (a.k.a. OpenMP 4.5) 2025-08-14T21:28:55.4532970Z omp_get_max_threads() : 8 2025-08-14T21:28:55.4533970Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-08-14T21:28:55.4534515Z mkl_get_max_threads() : 8 2025-08-14T21:28:55.4534870Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-08-14T21:28:55.4535292Z std::thread::hardware_concurrency() : 16 2025-08-14T21:28:55.4535597Z Environment variables: 2025-08-14T21:28:55.4535847Z OMP_NUM_THREADS : [not set] 2025-08-14T21:28:55.4536102Z MKL_NUM_THREADS : [not set] 2025-08-14T21:28:55.4536368Z ATen parallel backend: OpenMP 2025-08-14T21:28:55.4536541Z 2025-08-14T21:28:55.7811573Z + [[ slow == *numpy_2* ]] 2025-08-14T21:28:55.7811920Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *aarch64* ]] 2025-08-14T21:28:55.7812293Z + [[ slow == *backward* ]] 2025-08-14T21:28:55.7812545Z + [[ slow == *xla* ]] 2025-08-14T21:28:55.7812782Z + [[ slow == *executorch* ]] 2025-08-14T21:28:55.7813046Z + [[ slow == \j\i\t\_\l\e\g\a\c\y ]] 2025-08-14T21:28:55.7813386Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *libtorch* ]] 2025-08-14T21:28:55.7813764Z + [[ slow == distributed ]] 2025-08-14T21:28:55.7814038Z + [[ slow == *operator_benchmark* ]] 2025-08-14T21:28:55.7814329Z + [[ slow == *inductor_distributed* ]] 2025-08-14T21:28:55.7814627Z + [[ slow == *inductor-halide* ]] 2025-08-14T21:28:55.7814923Z + [[ slow == *inductor-triton-cpu* ]] 2025-08-14T21:28:55.7815228Z + [[ slow == *inductor-micro-benchmark* ]] 2025-08-14T21:28:55.7815536Z + [[ slow == *huggingface* ]] 2025-08-14T21:28:55.7815794Z + [[ slow == *timm* ]] 2025-08-14T21:28:55.7816031Z + [[ slow == cachebench ]] 2025-08-14T21:28:55.7816288Z + [[ slow == verify_cachebench ]] 2025-08-14T21:28:55.7816564Z + [[ slow == *torchbench* ]] 2025-08-14T21:28:55.7816842Z + [[ slow == *inductor_cpp_wrapper* ]] 2025-08-14T21:28:55.7817123Z + [[ slow == *inductor* ]] 2025-08-14T21:28:55.7817376Z + [[ slow == *einops* ]] 2025-08-14T21:28:55.7817630Z + [[ slow == *dynamo_wrapped* ]] 2025-08-14T21:28:55.7817950Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *rocm* ]] 2025-08-14T21:28:55.7818281Z + [[ 3 == 1 ]] 2025-08-14T21:28:55.7818491Z + [[ 3 == 2 ]] 2025-08-14T21:28:55.7818699Z + [[ 3 -gt 2 ]] 2025-08-14T21:28:55.7818918Z + install_torchvision 2025-08-14T21:28:55.7819158Z + local orig_preload 2025-08-14T21:28:55.7819378Z + local commit 2025-08-14T21:28:55.7819605Z ++ get_pinned_commit vision 2025-08-14T21:28:55.7819886Z ++ cat .github/ci_commit_pins/vision.txt 2025-08-14T21:28:55.7846823Z + commit=966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-08-14T21:28:55.7847358Z + orig_preload= 2025-08-14T21:28:55.7847583Z + '[' -n '' ']' 2025-08-14T21:28:55.7847879Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 == *cuda* ]] 2025-08-14T21:28:55.7848313Z + export FORCE_CUDA=1 2025-08-14T21:28:55.7848905Z + FORCE_CUDA=1 2025-08-14T21:28:55.7849201Z + export WITH_CUDA=1 2025-08-14T21:28:55.7849446Z + WITH_CUDA=1 2025-08-14T21:28:55.7849982Z + pip_build_and_install git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 dist/vision 2025-08-14T21:28:55.7850795Z + local build_target=git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-08-14T21:28:55.7851316Z + local wheel_dir=dist/vision 2025-08-14T21:28:55.7851583Z + local found_whl=0 2025-08-14T21:28:55.7851828Z + for file in "${wheel_dir}"/*.whl 2025-08-14T21:28:55.7852112Z + [[ -f dist/vision/*.whl ]] 2025-08-14T21:28:55.7852365Z + '[' 0 == 0 ']' 2025-08-14T21:28:55.7853042Z + python3 -m pip wheel --no-build-isolation --no-deps --no-use-pep517 -w dist/vision git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-08-14T21:28:56.1157286Z Collecting git+https://github.com/pytorch/vision.git@966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-08-14T21:28:56.1166048Z Cloning https://github.com/pytorch/vision.git (to revision 966da7e46f65d6d49df3e31214470a4fe5cc8e66) to /tmp/pip-req-build-b_fdktrc 2025-08-14T21:28:56.1350769Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/vision.git /tmp/pip-req-build-b_fdktrc 2025-08-14T21:28:57.7951112Z Running command git rev-parse -q --verify 'sha^966da7e46f65d6d49df3e31214470a4fe5cc8e66' 2025-08-14T21:28:57.7989547Z Running command git fetch -q https://github.com/pytorch/vision.git 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-08-14T21:28:57.9397233Z Running command git checkout -q 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-08-14T21:28:58.2686437Z Resolved https://github.com/pytorch/vision.git to commit 966da7e46f65d6d49df3e31214470a4fe5cc8e66 2025-08-14T21:29:00.9312521Z Preparing metadata (setup.py) ... [?25l- \ | done 2025-08-14T21:29:00.9350696Z [?25hBuilding wheels for collected packages: torchvision 2025-08-14T21:29:00.9425364Z  DEPRECATION: Building 'torchvision' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'torchvision'. Discussion can be found at https://github.com/pypa/pip/issues/6334 2025-08-14T21:30:26.2488860Z  Building wheel for torchvision (setup.py) ... [?25l- \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ | / - \ done 2025-08-14T21:30:26.2522419Z [?25h Created wheel for torchvision: filename=torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl size=2178375 sha256=41be51917c50200a72023999c3b56096cae9b98becfcb9072cae9f5245b6520a 2025-08-14T21:30:26.2523986Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/9c/9d/3e/42fa2d5ac6ba44a90363f8fff0fa9e712e24d4f977637c81cb 2025-08-14T21:30:26.2562836Z Successfully built torchvision 2025-08-14T21:30:26.3672291Z + for file in "${wheel_dir}"/*.whl 2025-08-14T21:30:26.3672876Z + pip_install_whl dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-08-14T21:30:26.3673605Z + args=('dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl') 2025-08-14T21:30:26.3674088Z + local args 2025-08-14T21:30:26.3674504Z + [[ dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-08-14T21:30:26.3675206Z + for path in "${args[@]}" 2025-08-14T21:30:26.3675660Z + echo 'Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl' 2025-08-14T21:30:26.3676443Z Installing dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-08-14T21:30:26.3677214Z + python3 -mpip install --no-index --no-deps dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-08-14T21:30:26.7077610Z Processing ./dist/vision/torchvision-0.22.0a0+966da7e-cp310-cp310-linux_x86_64.whl 2025-08-14T21:30:26.7177616Z Installing collected packages: torchvision 2025-08-14T21:30:27.1805865Z Successfully installed torchvision-0.22.0a0+966da7e 2025-08-14T21:30:27.2211637Z + '[' -n '' ']' 2025-08-14T21:30:27.2211936Z + test_python_shard 3 2025-08-14T21:30:27.2212172Z + [[ -z 3 ]] 2025-08-14T21:30:27.2212858Z + python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --shard 3 3 --verbose --upload-artifacts-while-running 2025-08-14T21:30:30.2553645Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:30:30.2555383Z import pkg_resources 2025-08-14T21:30:32.3960611Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/workspace/test/.pytorch-disabled-tests.json 2025-08-14T21:30:32.4422286Z Ignoring disabled issues: [''] 2025-08-14T21:30:32.4553617Z Found test times from artifacts 2025-08-14T21:30:32.5071661Z Found test times from artifacts 2025-08-14T21:30:32.5086157Z Running all tests 2025-08-14T21:30:32.5560360Z Running parallel tests on 3 processes 2025-08-14T21:30:32.5572153Z Name: tests to run (est. time: 66.24min) 2025-08-14T21:30:32.5572454Z Serial tests (0): 2025-08-14T21:30:32.5575507Z Parallel tests (174): 2025-08-14T21:30:32.5578207Z export/test_torchbind 1/1 2025-08-14T21:30:32.5578522Z test_testing 1/1 2025-08-14T21:30:32.5578829Z inductor/test_aot_inductor 1/1 2025-08-14T21:30:32.5579228Z test_jit_fuser_te 1/1 2025-08-14T21:30:32.5579585Z test_public_bindings 1/1 2025-08-14T21:30:32.5579951Z test_linalg 1/1 2025-08-14T21:30:32.5580380Z inductor/test_torchinductor_dynamic_shapes 1/2 2025-08-14T21:30:32.5580885Z inductor/test_torchinductor_codegen_dynamic_shapes 1/2 2025-08-14T21:30:32.5581400Z inductor/test_torchinductor_codegen_dynamic_shapes 2/2 2025-08-14T21:30:32.5581939Z inductor/test_torchinductor_opinfo 2/11 2025-08-14T21:30:32.5582332Z inductor/test_torchinductor_opinfo 5/11 2025-08-14T21:30:32.5582659Z inductor/test_torchinductor_opinfo 6/11 2025-08-14T21:30:32.5582990Z inductor/test_torchinductor_opinfo 8/11 2025-08-14T21:30:32.5583299Z inductor/test_cpu_repro 1/1 2025-08-14T21:30:32.5583578Z inductor/test_cuda_repro 1/1 2025-08-14T21:30:32.5583855Z inductor/test_kernel_benchmark 1/1 2025-08-14T21:30:32.5584163Z inductor/test_cudagraph_trees 1/1 2025-08-14T21:30:32.5584457Z dynamo/test_dynamic_shapes 1/2 2025-08-14T21:30:32.5584733Z dynamo/test_dynamic_shapes 2/2 2025-08-14T21:30:32.5585011Z dynamo/test_unspec 1/1 2025-08-14T21:30:32.5585264Z dynamo/test_repros 1/1 2025-08-14T21:30:32.5585546Z inductor/test_pattern_matcher 1/1 2025-08-14T21:30:32.5585905Z inductor/test_mkldnn_pattern_matcher 1/1 2025-08-14T21:30:32.5586222Z inductor/test_extension_backend 1/1 2025-08-14T21:30:32.5586512Z inductor/test_perf 1/1 2025-08-14T21:30:32.5586768Z inductor/test_foreach 1/1 2025-08-14T21:30:32.5587032Z dynamo/test_modules 1/1 2025-08-14T21:30:32.5587293Z inductor/test_triton_wrapper 1/1 2025-08-14T21:30:32.5587604Z inductor/test_coordinate_descent_tuner 1/1 2025-08-14T21:30:32.5587929Z inductor/test_select_algorithm 1/1 2025-08-14T21:30:32.5588232Z inductor/test_dependencies 1/1 2025-08-14T21:30:32.5588520Z inductor/test_compiled_autograd 1/2 2025-08-14T21:30:32.5588820Z inductor/test_halide 1/1 2025-08-14T21:30:32.5589115Z inductor/test_triton_extension_backend 1/1 2025-08-14T21:30:32.5589427Z inductor/test_autoheuristic 1/1 2025-08-14T21:30:32.5589714Z inductor/test_mps_basic 1/1 2025-08-14T21:30:32.5589998Z dynamo/test_precompile_context 1/1 2025-08-14T21:30:32.5590322Z dynamo/test_cudagraphs_expandable_segments 1/1 2025-08-14T21:30:32.5591521Z dynamo/test_guard_manager 1/1 2025-08-14T21:30:32.5591809Z test_sparse_semi_structured 1/1 2025-08-14T21:30:32.5592089Z export/test_serialize 1/1 2025-08-14T21:30:32.5592362Z dynamo/test_structured_trace 1/1 2025-08-14T21:30:32.5592649Z export/test_verifier 1/1 2025-08-14T21:30:32.5592917Z dynamo/test_recompile_ux 1/1 2025-08-14T21:30:32.5593191Z dynamo/test_minifier 1/1 2025-08-14T21:30:32.5593512Z inductor/test_cudagraph_trees_expandable_segments 1/1 2025-08-14T21:30:32.5593881Z test_functionalization_of_rng_ops 1/1 2025-08-14T21:30:32.5594172Z dynamo/test_hooks 1/1 2025-08-14T21:30:32.5594427Z dynamo/test_generator 1/1 2025-08-14T21:30:32.5594697Z dynamo/test_reorder_logs 1/1 2025-08-14T21:30:32.5594964Z dynamo/test_subclasses 1/1 2025-08-14T21:30:32.5595234Z export/test_sparse 1/1 2025-08-14T21:30:32.5595670Z functorch/test_eager_transforms 1/1 2025-08-14T21:30:32.5595968Z export/test_draft_export 1/1 2025-08-14T21:30:32.5596251Z dynamo/test_comptime 1/1 2025-08-14T21:30:32.5596525Z dynamo/test_python_autograd 1/1 2025-08-14T21:30:32.5596800Z test_quantization 1/4 2025-08-14T21:30:32.5597054Z test_dynamic_shapes 1/1 2025-08-14T21:30:32.5597341Z higher_order_ops/test_invoke_subgraph 1/1 2025-08-14T21:30:32.5597636Z test_package 1/1 2025-08-14T21:30:32.5597882Z functorch/test_rearrange 1/1 2025-08-14T21:30:32.5598164Z functorch/test_parsing 1/1 2025-08-14T21:30:32.5598424Z test_hop_infra 1/1 2025-08-14T21:30:32.5598662Z test_comparison_utils 1/1 2025-08-14T21:30:32.5598935Z functorch/test_control_flow 1/1 2025-08-14T21:30:32.5599218Z functorch/test_ac_logging 1/1 2025-08-14T21:30:32.5599481Z test_mkl_verbose 1/1 2025-08-14T21:30:32.5599742Z test_appending_byte_serializer 1/1 2025-08-14T21:30:32.5600024Z test_autoload 1/1 2025-08-14T21:30:32.5600260Z test_proxy_tensor 1/1 2025-08-14T21:30:32.5600509Z test_mkldnn_verbose 1/1 2025-08-14T21:30:32.5600774Z test_utils_config_module 1/1 2025-08-14T21:30:32.5601044Z test_functionalization 1/1 2025-08-14T21:30:32.5601352Z test_rename_privateuse1_to_existing_device 1/1 2025-08-14T21:30:32.5601676Z test_transformers 1/1 2025-08-14T21:30:32.5601925Z test_ao_sparsity 1/1 2025-08-14T21:30:32.5602165Z test_license 1/1 2025-08-14T21:30:32.5602407Z torch_np/test_binary_ufuncs 1/1 2025-08-14T21:30:32.5602689Z torch_np/test_unary_ufuncs 1/1 2025-08-14T21:30:32.5602955Z test_foreach 1/1 2025-08-14T21:30:32.5603194Z test_expanded_weights 1/1 2025-08-14T21:30:32.5603468Z typing/test_python_operators 1/1 2025-08-14T21:30:32.5603750Z test_fx_experimental 1/1 2025-08-14T21:30:32.5604006Z test_file_check 1/1 2025-08-14T21:30:32.5604244Z torch_np/test_dtype 1/1 2025-08-14T21:30:32.5604603Z higher_order_ops/test_with_effects 1/1 2025-08-14T21:30:32.5604914Z test_native_functions 1/1 2025-08-14T21:30:32.5605180Z functorch/test_minifier 1/1 2025-08-14T21:30:32.5605442Z test_flop_counter 1/1 2025-08-14T21:30:32.5605700Z torch_np/test_nep50_examples 1/1 2025-08-14T21:30:32.5606001Z higher_order_ops/test_invoke_quant 1/1 2025-08-14T21:30:32.5606293Z test_per_overload_api 1/1 2025-08-14T21:30:32.5606561Z test_tensorexpr_pybind 1/1 2025-08-14T21:30:32.5606817Z test_pytree 1/1 2025-08-14T21:30:32.5607031Z test_fx_passes 1/1 2025-08-14T21:30:32.5607268Z test_module_tracker 1/1 2025-08-14T21:30:32.5607513Z test_typing 1/1 2025-08-14T21:30:32.5607726Z test_openmp 1/1 2025-08-14T21:30:32.5608000Z torch_np/numpy_tests/core/test_scalarinherit 1/1 2025-08-14T21:30:32.5608337Z functorch/test_ac_knapsack 1/1 2025-08-14T21:30:32.5608623Z backends/xeon/test_launch 1/1 2025-08-14T21:30:32.5608892Z test_nestedtensor 1/1 2025-08-14T21:30:32.5609145Z test_namedtensor 1/1 2025-08-14T21:30:32.5609388Z xpu/test_gemm 1/1 2025-08-14T21:30:32.5609628Z torch_np/test_ufuncs_basic 1/1 2025-08-14T21:30:32.5609986Z test_meta 1/1 2025-08-14T21:30:32.5610211Z test_utils_filelock 1/1 2025-08-14T21:30:32.5610463Z torch_np/test_random 1/1 2025-08-14T21:30:32.5610727Z profiler/test_kineto 1/1 2025-08-14T21:30:32.5611001Z test_compile_benchmark_util 1/1 2025-08-14T21:30:32.5611273Z test_jiterator 1/1 2025-08-14T21:30:32.5611503Z test_optim 1/1 2025-08-14T21:30:32.5611722Z test_decomp 1/16 2025-08-14T21:30:32.5611940Z test_decomp 5/16 2025-08-14T21:30:32.5612162Z test_decomp 6/16 2025-08-14T21:30:32.5612386Z test_decomp 10/16 2025-08-14T21:30:32.5612608Z test_decomp 14/16 2025-08-14T21:30:32.5612855Z test_decomp 15/16 2025-08-14T21:30:32.5613092Z test_binary_ufuncs 1/1 2025-08-14T21:30:32.5613339Z test_matmul_cuda 1/1 2025-08-14T21:30:32.5613566Z test_weak 1/1 2025-08-14T21:30:32.5613797Z functorch/test_logging 1/1 2025-08-14T21:30:32.5614052Z test_logging 1/1 2025-08-14T21:30:32.5614480Z test_out_dtype_op 1/1 2025-08-14T21:30:32.5614840Z test_complex 1/1 2025-08-14T21:30:32.5615219Z cpp_extensions/python_agnostic_extension/test/test_python_agnostic 1/1 2025-08-14T21:30:32.5615651Z torch_np/numpy_tests/core/test_einsum 1/1 2025-08-14T21:30:32.5615948Z test_ops_jit 1/1 2025-08-14T21:30:32.5616189Z lazy/test_step_closures 1/1 2025-08-14T21:30:32.5616456Z test_ops_gradients 1/1 2025-08-14T21:30:32.5616718Z test_fx_reinplace_pass 1/1 2025-08-14T21:30:32.5616986Z nn/test_multihead_attention 1/1 2025-08-14T21:30:32.5617406Z functorch/test_aot_joint_with_descriptors 1/1 2025-08-14T21:30:32.5617735Z test_pruning_op 1/1 2025-08-14T21:30:32.5617984Z lazy/test_functionalization 1/1 2025-08-14T21:30:32.5618268Z functorch/test_ac 1/1 2025-08-14T21:30:32.5618525Z test_ops_fwd_gradients 1/1 2025-08-14T21:30:32.5618773Z test_hub 1/1 2025-08-14T21:30:32.5619022Z test_set_default_mobile_cpu_allocator 1/1 2025-08-14T21:30:32.5619354Z test_cuda_expandable_segments 1/1 2025-08-14T21:30:32.5619634Z test_stateless 1/1 2025-08-14T21:30:32.5619884Z test_numpy_interop 1/1 2025-08-14T21:30:32.5620151Z profiler/test_record_function 1/1 2025-08-14T21:30:32.5620435Z test_type_hints 1/1 2025-08-14T21:30:32.5620672Z nn/test_lazy_modules 1/1 2025-08-14T21:30:32.5620937Z test_segment_reductions 1/1 2025-08-14T21:30:32.5621203Z test_import_stats 1/1 2025-08-14T21:30:32.5621442Z test_masked 1/1 2025-08-14T21:30:32.5621667Z test_cuda_sanitizer 1/1 2025-08-14T21:30:32.5621920Z test_monitor 1/1 2025-08-14T21:30:32.5622162Z profiler/test_cpp_thread 1/1 2025-08-14T21:30:32.5622426Z test_subclass 1/1 2025-08-14T21:30:32.5622667Z profiler/test_profiler 1/1 2025-08-14T21:30:32.5622943Z functorch/test_vmap_registrations 1/1 2025-08-14T21:30:32.5623248Z nn/test_parametrization 1/1 2025-08-14T21:30:32.5623515Z nn/test_pruning 1/1 2025-08-14T21:30:32.5623747Z functorch/test_dims 1/1 2025-08-14T21:30:32.5623991Z test_itt 1/1 2025-08-14T21:30:32.5624248Z torch_np/numpy_tests/core/test_multiarray 1/2 2025-08-14T21:30:32.5624573Z functorch/test_aotdispatch 1/1 2025-08-14T21:30:32.5624850Z test_schema_check 1/1 2025-08-14T21:30:32.5625093Z test_datapipe 1/1 2025-08-14T21:30:32.5625323Z test_bundled_inputs 1/1 2025-08-14T21:30:32.5625589Z profiler/test_execution_trace 1/1 2025-08-14T21:30:32.5625900Z lazy/test_generator 1/1 2025-08-14T21:30:32.5626208Z torch_np/numpy_tests/core/test_indexing 1/1 2025-08-14T21:30:32.5626510Z test_maskedtensor 1/1 2025-08-14T21:30:32.5626760Z test_tensorboard 1/1 2025-08-14T21:30:32.5627040Z torch_np/numpy_tests/lib/test_type_check 1/1 2025-08-14T21:30:32.5627341Z lazy/test_ts_opinfo 1/1 2025-08-14T21:30:32.5627596Z test_mkldnn_fusion 1/1 2025-08-14T21:30:32.5627869Z benchmark_utils/test_benchmark_utils 1/1 2025-08-14T21:30:32.5628167Z optim/test_swa_utils 1/1 2025-08-14T21:30:32.5628419Z test_sparse_csr 2/2 2025-08-14T21:30:32.5628672Z Name: excluded (est. time: 0.0min) 2025-08-14T21:30:32.5629035Z Serial tests (0): 2025-08-14T21:30:32.5629262Z Parallel tests (0): 2025-08-14T21:30:32.5629595Z Running export/test_torchbind 1/1 ... [2025-08-14 21:30:32.561216] 2025-08-14T21:30:32.5629979Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:30:32.5630935Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_torchbind.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:30:32.561523] 2025-08-14T21:30:40.9905795Z 2025-08-14T21:30:40.9907026Z export/test_torchbind 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_torchbind_1.1_3403cc13c6c437a4_.log 2025-08-14T21:30:40.9908125Z Running 0 items in this shard: 2025-08-14T21:30:40.9908384Z 2025-08-14T21:30:41.3050973Z Uploading artifacts took 0.31 seconds 2025-08-14T21:30:41.3056380Z Running test_testing 1/1 ... [2025-08-14 21:30:41.305217] 2025-08-14T21:30:41.3056844Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:30:41.3059670Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_testing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:30:41.305588] 2025-08-14T21:30:47.0809868Z 2025-08-14T21:30:47.0810788Z test_testing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_testing_1.1_428bac4584b126f4_.log 2025-08-14T21:30:47.0811402Z Running 0 items in this shard: 2025-08-14T21:30:47.0811582Z 2025-08-14T21:30:47.0814251Z Running inductor/test_aot_inductor 1/1 ... [2025-08-14 21:30:47.081058] 2025-08-14T21:30:47.0814657Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:30:47.0817838Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:30:47.081407] 2025-08-14T21:30:54.6590264Z 2025-08-14T21:30:54.6591283Z inductor/test_aot_inductor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_1.1_49cc9f651ce74b84_.log 2025-08-14T21:30:54.6592020Z Running 0 items in this shard: 2025-08-14T21:30:54.6592205Z 2025-08-14T21:30:54.6592982Z Running test_jit_fuser_te 1/1 ... [2025-08-14 21:30:54.659047] 2025-08-14T21:30:54.6593362Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:30:54.6597833Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jit_fuser_te.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:30:54.659376] 2025-08-14T21:31:02.2377231Z 2025-08-14T21:31:02.2377997Z test_jit_fuser_te 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_jit_fuser_te_1.1_aa945fb3085b14d9_.log 2025-08-14T21:31:02.2378659Z Running 0 items in this shard: 2025-08-14T21:31:02.2380734Z 2025-08-14T21:31:02.2380936Z Running test_public_bindings 1/1 ... [2025-08-14 21:31:02.237807] 2025-08-14T21:31:02.2381333Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:31:02.2385078Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_public_bindings.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:31:02.238171] 2025-08-14T21:31:06.1098546Z 2025-08-14T21:31:06.1099652Z test_public_bindings 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_public_bindings_1.1_db83b09c1c31bc62_.log 2025-08-14T21:31:06.1100450Z Running 0 items in this shard: 2025-08-14T21:31:06.1100664Z 2025-08-14T21:31:06.1100835Z Running test_linalg 1/1 ... [2025-08-14 21:31:06.109703] 2025-08-14T21:31:06.1101520Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:31:06.1103829Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_linalg.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:31:06.110049] 2025-08-14T21:31:11.2843287Z 2025-08-14T21:31:11.2843961Z test_linalg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_linalg_1.1_a9f2a3ed094707dd_.log 2025-08-14T21:31:11.2846968Z Running 8 items in this shard: test/test_linalg.py::TestLinalgCUDA::test_svd_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_svd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_svd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_svd_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_svd_memory_allocation_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_svd_memory_allocation_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_svd_memory_allocation_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_svd_memory_allocation_cuda_float64 2025-08-14T21:31:11.2849429Z 2025-08-14T21:31:11.2849703Z Running inductor/test_torchinductor_dynamic_shapes 1/2 ... [2025-08-14 21:31:11.284473] 2025-08-14T21:31:11.2850283Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:31:11.2851967Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:31:11.284884] 2025-08-14T21:31:18.9621281Z 2025-08-14T21:31:18.9622364Z inductor/test_torchinductor_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_1.2_e65a746773479ef9_.log 2025-08-14T21:31:18.9627137Z Running 2 items in this shard: test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_block_sizes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_block_sizes_dynamic_shapes_cuda 2025-08-14T21:31:18.9628272Z 2025-08-14T21:31:18.9628570Z Running inductor/test_torchinductor_codegen_dynamic_shapes 1/2 ... [2025-08-14 21:31:18.962296] 2025-08-14T21:31:18.9629073Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:31:18.9630283Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_dynamic_shapes.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:31:18.962699] 2025-08-14T21:31:26.7411390Z 2025-08-14T21:31:26.7412533Z inductor/test_torchinductor_codegen_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.2_29e13ac9e09d87bf_.log 2025-08-14T21:31:26.7413473Z Running 0 items in this shard: 2025-08-14T21:31:26.7413654Z 2025-08-14T21:31:26.7415555Z Running inductor/test_torchinductor_codegen_dynamic_shapes 2/2 ... [2025-08-14 21:31:26.741241] 2025-08-14T21:31:26.7416071Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:31:26.7419530Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_dynamic_shapes.py', '-m', 'serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:31:26.741607] 2025-08-14T21:31:34.5193806Z 2025-08-14T21:31:34.5195452Z inductor/test_torchinductor_codegen_dynamic_shapes 2/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_2.2_1c2b59265d1c83de_.log 2025-08-14T21:31:34.5198655Z Running 2 items in this shard: test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_block_sizes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_block_sizes_dynamic_shapes_cuda 2025-08-14T21:31:34.5200919Z 2025-08-14T21:31:34.5201346Z Running inductor/test_torchinductor_opinfo 2/11 ... [2025-08-14 21:31:34.519485] 2025-08-14T21:31:34.5202078Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:31:34.5203735Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=2', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:31:34.519835] 2025-08-14T21:31:43.7510522Z 2025-08-14T21:31:43.7511798Z inductor/test_torchinductor_opinfo 2/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_2.11_cd00328892db7f00_.log 2025-08-14T21:31:43.7512877Z Running 0 items in this shard: 2025-08-14T21:31:43.7513124Z 2025-08-14T21:31:43.7514020Z Running inductor/test_torchinductor_opinfo 5/11 ... [2025-08-14 21:31:43.751131] 2025-08-14T21:31:43.7514458Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:31:43.7518090Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=5', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:31:43.751495] 2025-08-14T21:31:52.9832465Z 2025-08-14T21:31:52.9833438Z inductor/test_torchinductor_opinfo 5/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_5.11_3945404b3985866f_.log 2025-08-14T21:31:52.9834249Z Running 0 items in this shard: 2025-08-14T21:31:52.9834471Z 2025-08-14T21:31:52.9836652Z Running inductor/test_torchinductor_opinfo 6/11 ... [2025-08-14 21:31:52.983376] 2025-08-14T21:31:52.9837312Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:31:52.9840592Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=6', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:31:52.983724] 2025-08-14T21:32:02.1639514Z 2025-08-14T21:32:02.1640598Z inductor/test_torchinductor_opinfo 6/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_6.11_b0df501d68adc34e_.log 2025-08-14T21:32:02.1641430Z Running 0 items in this shard: 2025-08-14T21:32:02.1641635Z 2025-08-14T21:32:02.1643108Z Running inductor/test_torchinductor_opinfo 8/11 ... [2025-08-14 21:32:02.164043] 2025-08-14T21:32:02.1643714Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:32:02.1647587Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'serial', '--shard-id=8', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:32:02.164377] 2025-08-14T21:32:11.3451956Z 2025-08-14T21:32:11.3453049Z inductor/test_torchinductor_opinfo 8/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.11_82e4b6f8d8de8356_.log 2025-08-14T21:32:11.3453868Z Running 0 items in this shard: 2025-08-14T21:32:11.3454053Z 2025-08-14T21:32:11.3455373Z Running inductor/test_cpu_repro 1/1 ... [2025-08-14 21:32:11.345262] 2025-08-14T21:32:11.3455778Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:32:11.3458987Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_repro.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:32:11.345589] 2025-08-14T21:32:18.7729384Z 2025-08-14T21:32:18.7730166Z inductor/test_cpu_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_repro_1.1_d9d8aae1e6c61cd7_.log 2025-08-14T21:32:18.7730878Z Running 0 items in this shard: 2025-08-14T21:32:18.7731067Z 2025-08-14T21:32:18.7733336Z Running inductor/test_cuda_repro 1/1 ... [2025-08-14 21:32:18.773039] 2025-08-14T21:32:18.7733752Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:32:18.7737021Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_repro.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:32:18.773371] 2025-08-14T21:32:26.1512050Z 2025-08-14T21:32:26.1513908Z inductor/test_cuda_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_repro_1.1_55d9a103999b0117_.log 2025-08-14T21:32:26.1525810Z Running 0 items in this shard: 2025-08-14T21:32:26.1526049Z 2025-08-14T21:32:26.1526447Z Running inductor/test_kernel_benchmark 1/1 ... [2025-08-14 21:32:26.151328] 2025-08-14T21:32:26.1527044Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:32:26.1528224Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_kernel_benchmark.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:32:26.151658] 2025-08-14T21:32:32.7777739Z 2025-08-14T21:32:32.7778692Z inductor/test_kernel_benchmark 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_kernel_benchmark_1.1_df2d6f139dcc92f4_.log 2025-08-14T21:32:32.7779626Z Running 0 items in this shard: 2025-08-14T21:32:32.7779859Z 2025-08-14T21:32:32.7781700Z Running inductor/test_cudagraph_trees 1/1 ... [2025-08-14 21:32:32.777869] 2025-08-14T21:32:32.7782151Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:32:32.7785465Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudagraph_trees.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:32:32.778201] 2025-08-14T21:32:39.4539119Z 2025-08-14T21:32:39.4540095Z inductor/test_cudagraph_trees 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudagraph_trees_1.1_a3fe4b5b8ef2b54b_.log 2025-08-14T21:32:39.4540989Z Running 0 items in this shard: 2025-08-14T21:32:39.4541223Z 2025-08-14T21:32:39.4543133Z Running dynamo/test_dynamic_shapes 1/2 ... [2025-08-14 21:32:39.454013] 2025-08-14T21:32:39.4543549Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:32:39.4546823Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_dynamic_shapes.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:32:39.454346] 2025-08-14T21:32:48.2340340Z 2025-08-14T21:32:48.2341214Z dynamo/test_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_dynamic_shapes_1.2_6affaaed98f775d6_.log 2025-08-14T21:32:48.2342636Z Running 2 items in this shard: test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dont_dce_rand_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_mem_leak_guards_dynamic_shapes 2025-08-14T21:32:48.2343536Z 2025-08-14T21:32:48.2345510Z Running dynamo/test_dynamic_shapes 2/2 ... [2025-08-14 21:32:48.234133] 2025-08-14T21:32:48.2346069Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:32:48.2349803Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_dynamic_shapes.py', '-m', 'serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:32:48.234468] 2025-08-14T21:32:56.7135779Z 2025-08-14T21:32:56.7136778Z dynamo/test_dynamic_shapes 2/2 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_dynamic_shapes_2.2_489e15cdbe712c04_.log 2025-08-14T21:32:56.7137505Z Running 0 items in this shard: 2025-08-14T21:32:56.7137752Z 2025-08-14T21:32:56.7140482Z Running dynamo/test_unspec 1/1 ... [2025-08-14 21:32:56.713725] 2025-08-14T21:32:56.7140877Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:32:56.7144766Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_unspec.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:32:56.714093] 2025-08-14T21:33:01.2871865Z 2025-08-14T21:33:01.2872645Z dynamo/test_unspec 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_unspec_1.1_77c4b5228e9df64e_.log 2025-08-14T21:33:01.2873315Z Running 0 items in this shard: 2025-08-14T21:33:01.2873527Z 2025-08-14T21:33:01.2876106Z Running dynamo/test_repros 1/1 ... [2025-08-14 21:33:01.287314] 2025-08-14T21:33:01.2876512Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:01.2879863Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_repros.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:01.287676] 2025-08-14T21:33:05.9607553Z 2025-08-14T21:33:05.9608343Z dynamo/test_repros 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_repros_1.1_6ad3aa68bd0162ed_.log 2025-08-14T21:33:05.9609433Z Running 2 items in this shard: test/dynamo/test_repros.py::ReproTests::test_dont_dce_rand, test/dynamo/test_repros.py::ReproTests::test_mem_leak_guards 2025-08-14T21:33:05.9610004Z 2025-08-14T21:33:05.9610916Z Running inductor/test_pattern_matcher 1/1 ... [2025-08-14 21:33:05.960834] 2025-08-14T21:33:05.9611329Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:05.9615704Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_pattern_matcher.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:05.961196] 2025-08-14T21:33:12.6372735Z 2025-08-14T21:33:12.6373664Z inductor/test_pattern_matcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_pattern_matcher_1.1_a575b490b2df0e9d_.log 2025-08-14T21:33:12.6374432Z Running 0 items in this shard: 2025-08-14T21:33:12.6374642Z 2025-08-14T21:33:12.6377016Z Running inductor/test_mkldnn_pattern_matcher 1/1 ... [2025-08-14 21:33:12.637391] 2025-08-14T21:33:12.6377475Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:12.6381326Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mkldnn_pattern_matcher.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:12.637768] 2025-08-14T21:33:19.6654153Z 2025-08-14T21:33:19.6655081Z inductor/test_mkldnn_pattern_matcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mkldnn_pattern_matcher_1.1_a4228dd5cc3029fb_.log 2025-08-14T21:33:19.6655886Z Running 0 items in this shard: 2025-08-14T21:33:19.6656065Z 2025-08-14T21:33:19.6658393Z Running inductor/test_extension_backend 1/1 ... [2025-08-14 21:33:19.665541] 2025-08-14T21:33:19.6658833Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:19.6662525Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_extension_backend.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:19.665914] 2025-08-14T21:33:26.8934461Z 2025-08-14T21:33:26.8935360Z inductor/test_extension_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_extension_backend_1.1_2dbcc54c737b0136_.log 2025-08-14T21:33:26.8936138Z Running 0 items in this shard: 2025-08-14T21:33:26.8936326Z 2025-08-14T21:33:26.8938330Z Running inductor/test_perf 1/1 ... [2025-08-14 21:33:26.893486] 2025-08-14T21:33:26.8938719Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:26.8942360Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_perf.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:26.893836] 2025-08-14T21:33:33.6202693Z 2025-08-14T21:33:33.6203644Z inductor/test_perf 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_perf_1.1_f500618ae9486d87_.log 2025-08-14T21:33:33.6204312Z Running 0 items in this shard: 2025-08-14T21:33:33.6204543Z 2025-08-14T21:33:33.6206456Z Running inductor/test_foreach 1/1 ... [2025-08-14 21:33:33.620319] 2025-08-14T21:33:33.6206862Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:33.6210592Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_foreach.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:33.620707] 2025-08-14T21:33:41.0483875Z 2025-08-14T21:33:41.0484815Z inductor/test_foreach 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_foreach_1.1_a8f0558397ce3279_.log 2025-08-14T21:33:41.0485537Z Running 0 items in this shard: 2025-08-14T21:33:41.0485716Z 2025-08-14T21:33:41.0486835Z Running dynamo/test_modules 1/1 ... [2025-08-14 21:33:41.048379] 2025-08-14T21:33:41.0487227Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:41.0490554Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_modules.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:41.048742] 2025-08-14T21:33:48.1261909Z 2025-08-14T21:33:48.1262805Z dynamo/test_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_modules_1.1_836ba8c61d988e49_.log 2025-08-14T21:33:48.1263594Z Running 0 items in this shard: 2025-08-14T21:33:48.1263810Z 2025-08-14T21:33:48.1265756Z Running inductor/test_triton_wrapper 1/1 ... [2025-08-14 21:33:48.126274] 2025-08-14T21:33:48.1266187Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:48.1270130Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_wrapper.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:48.126610] 2025-08-14T21:33:54.9039563Z 2025-08-14T21:33:54.9041206Z inductor/test_triton_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_wrapper_1.1_57001427517ef3fd_.log 2025-08-14T21:33:54.9042028Z Running 0 items in this shard: 2025-08-14T21:33:54.9042217Z 2025-08-14T21:33:54.9042475Z Running inductor/test_coordinate_descent_tuner 1/1 ... [2025-08-14 21:33:54.903875] 2025-08-14T21:33:54.9042976Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:33:54.9046830Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_coordinate_descent_tuner.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:33:54.904238] 2025-08-14T21:34:01.5312690Z 2025-08-14T21:34:01.5313707Z inductor/test_coordinate_descent_tuner 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_coordinate_descent_tuner_1.1_5dbc4fcc6d8ccd5c_.log 2025-08-14T21:34:01.5314543Z Running 0 items in this shard: 2025-08-14T21:34:01.5314731Z 2025-08-14T21:34:01.5317993Z Running inductor/test_select_algorithm 1/1 ... [2025-08-14 21:34:01.531490] 2025-08-14T21:34:01.5318432Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:01.5323043Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_select_algorithm.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:01.531843] 2025-08-14T21:34:08.3081746Z 2025-08-14T21:34:08.3083892Z inductor/test_select_algorithm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_select_algorithm_1.1_26088e8d5723f721_.log 2025-08-14T21:34:08.3085443Z Running 0 items in this shard: 2025-08-14T21:34:08.3085684Z 2025-08-14T21:34:08.3085981Z Running inductor/test_dependencies 1/1 ... [2025-08-14 21:34:08.308228] 2025-08-14T21:34:08.3086450Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:08.3090006Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_dependencies.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:08.308621] 2025-08-14T21:34:15.0852316Z 2025-08-14T21:34:15.0853660Z inductor/test_dependencies 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_dependencies_1.1_eedb6024cec1129b_.log 2025-08-14T21:34:15.0854411Z Running 0 items in this shard: 2025-08-14T21:34:15.0854617Z 2025-08-14T21:34:15.0856350Z Running inductor/test_compiled_autograd 1/2 ... [2025-08-14 21:34:15.085291] 2025-08-14T21:34:15.0856959Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:15.0860307Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_autograd.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:15.085662] 2025-08-14T21:34:23.6649981Z 2025-08-14T21:34:23.6650927Z inductor/test_compiled_autograd 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_autograd_1.2_ff307f387c036eaa_.log 2025-08-14T21:34:23.6651783Z Running 0 items in this shard: 2025-08-14T21:34:23.6652021Z 2025-08-14T21:34:23.6652937Z Running inductor/test_halide 1/1 ... [2025-08-14 21:34:23.665015] 2025-08-14T21:34:23.6653337Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:23.6657205Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_halide.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:23.665368] 2025-08-14T21:34:30.8927583Z 2025-08-14T21:34:30.8928765Z inductor/test_halide 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_halide_1.1_20c94affc301161d_.log 2025-08-14T21:34:30.8929362Z 2025-08-14T21:34:30.8931380Z Running inductor/test_triton_extension_backend 1/1 ... [2025-08-14 21:34:30.892812] 2025-08-14T21:34:30.8932002Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:30.8935354Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_extension_backend.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:30.893176] 2025-08-14T21:34:38.2704862Z 2025-08-14T21:34:38.2705836Z inductor/test_triton_extension_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_extension_backend_1.1_45e669d54ef88616_.log 2025-08-14T21:34:38.2706655Z Running 0 items in this shard: 2025-08-14T21:34:38.2706835Z 2025-08-14T21:34:38.2708884Z Running inductor/test_autoheuristic 1/1 ... [2025-08-14 21:34:38.270596] 2025-08-14T21:34:38.2709323Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:38.2712983Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_autoheuristic.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:38.270958] 2025-08-14T21:34:45.3979022Z 2025-08-14T21:34:45.3980361Z inductor/test_autoheuristic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_autoheuristic_1.1_10c8e3ab7860f4ed_.log 2025-08-14T21:34:45.3981123Z Running 0 items in this shard: 2025-08-14T21:34:45.3981309Z 2025-08-14T21:34:45.3982886Z Running inductor/test_mps_basic 1/1 ... [2025-08-14 21:34:45.398012] 2025-08-14T21:34:45.3983299Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:45.3987003Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mps_basic.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:45.398379] 2025-08-14T21:34:52.6263135Z 2025-08-14T21:34:52.6264016Z inductor/test_mps_basic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mps_basic_1.1_6369e8c9d1e3e816_.log 2025-08-14T21:34:52.6264628Z 2025-08-14T21:34:52.6267627Z Running dynamo/test_precompile_context 1/1 ... [2025-08-14 21:34:52.626465] 2025-08-14T21:34:52.6268069Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:52.6272223Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_precompile_context.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:52.626891] 2025-08-14T21:34:59.2523462Z 2025-08-14T21:34:59.2524303Z dynamo/test_precompile_context 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_precompile_context_1.1_d12ecc8d077c3000_.log 2025-08-14T21:34:59.2525120Z Running 0 items in this shard: 2025-08-14T21:34:59.2525307Z 2025-08-14T21:34:59.2527146Z Running dynamo/test_cudagraphs_expandable_segments 1/1 ... [2025-08-14 21:34:59.252383] 2025-08-14T21:34:59.2527635Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:34:59.2530860Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_cudagraphs_expandable_segments.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:34:59.252723] 2025-08-14T21:35:03.6252904Z 2025-08-14T21:35:03.6253924Z dynamo/test_cudagraphs_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_cudagraphs_expandable_segments_1.1_2b73e64dd0faa39c_.log 2025-08-14T21:35:03.6254803Z Running 0 items in this shard: 2025-08-14T21:35:03.6255021Z 2025-08-14T21:35:03.6256940Z Running dynamo/test_guard_manager 1/1 ... [2025-08-14 21:35:03.625412] 2025-08-14T21:35:03.6257370Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:03.6261202Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_guard_manager.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:03.625781] 2025-08-14T21:35:07.5983043Z 2025-08-14T21:35:07.5984056Z dynamo/test_guard_manager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_guard_manager_1.1_cbdf2edbea03f0a9_.log 2025-08-14T21:35:07.5984895Z Running 0 items in this shard: 2025-08-14T21:35:07.5985075Z 2025-08-14T21:35:07.5985295Z Running test_sparse_semi_structured 1/1 ... [2025-08-14 21:35:07.598016] 2025-08-14T21:35:07.5985692Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:07.5987587Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_semi_structured.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:07.598352] 2025-08-14T21:35:14.5799375Z 2025-08-14T21:35:14.5800710Z test_sparse_semi_structured 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_semi_structured_1.1_4492803570b5cb8a_.log 2025-08-14T21:35:14.5801718Z Running 0 items in this shard: 2025-08-14T21:35:14.5801907Z 2025-08-14T21:35:14.5803661Z Running export/test_serialize 1/1 ... [2025-08-14 21:35:14.580028] 2025-08-14T21:35:14.5804189Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:14.5807302Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_serialize.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:14.580363] 2025-08-14T21:35:18.7523393Z 2025-08-14T21:35:18.7524161Z export/test_serialize 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_serialize_1.1_381bdbeaf5980b0e_.log 2025-08-14T21:35:18.7524947Z Running 0 items in this shard: 2025-08-14T21:35:18.7525127Z 2025-08-14T21:35:18.7527609Z Running dynamo/test_structured_trace 1/1 ... [2025-08-14 21:35:18.752460] 2025-08-14T21:35:18.7528043Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:18.7531858Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_structured_trace.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:18.752856] 2025-08-14T21:35:25.5799475Z 2025-08-14T21:35:25.5800600Z dynamo/test_structured_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_structured_trace_1.1_ad7791956a1e129f_.log 2025-08-14T21:35:25.5801499Z Running 0 items in this shard: 2025-08-14T21:35:25.5801690Z 2025-08-14T21:35:25.5801873Z Running export/test_verifier 1/1 ... [2025-08-14 21:35:25.579839] 2025-08-14T21:35:25.5802258Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:25.5805043Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_verifier.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:25.580173] 2025-08-14T21:35:29.5520347Z 2025-08-14T21:35:29.5521233Z export/test_verifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_verifier_1.1_cca23f96c748147f_.log 2025-08-14T21:35:29.5522145Z Running 0 items in this shard: 2025-08-14T21:35:29.5522397Z 2025-08-14T21:35:29.5524357Z Running dynamo/test_recompile_ux 1/1 ... [2025-08-14 21:35:29.552144] 2025-08-14T21:35:29.5524889Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:29.5528178Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_recompile_ux.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:29.552492] 2025-08-14T21:35:33.7245278Z 2025-08-14T21:35:33.7246092Z dynamo/test_recompile_ux 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_recompile_ux_1.1_5192c675fc088558_.log 2025-08-14T21:35:33.7247337Z Running 0 items in this shard: 2025-08-14T21:35:33.7247516Z 2025-08-14T21:35:33.7249179Z Running dynamo/test_minifier 1/1 ... [2025-08-14 21:35:33.724619] 2025-08-14T21:35:33.7249560Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:33.7253146Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_minifier.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:33.724999] 2025-08-14T21:35:38.0977030Z 2025-08-14T21:35:38.0977777Z dynamo/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_minifier_1.1_3616437461683997_.log 2025-08-14T21:35:38.0978926Z Running 0 items in this shard: 2025-08-14T21:35:38.0979207Z 2025-08-14T21:35:38.0980804Z Running inductor/test_cudagraph_trees_expandable_segments 1/1 ... [2025-08-14 21:35:38.097785] 2025-08-14T21:35:38.0981299Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:38.0984373Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudagraph_trees_expandable_segments.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:38.098117] 2025-08-14T21:35:44.8749876Z 2025-08-14T21:35:44.8750946Z inductor/test_cudagraph_trees_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudagraph_trees_expandable_segments_1.1_2393cada8ea081e6_.log 2025-08-14T21:35:44.8752166Z Running 0 items in this shard: 2025-08-14T21:35:44.8752348Z 2025-08-14T21:35:44.8753665Z Running test_functionalization_of_rng_ops 1/1 ... [2025-08-14 21:35:44.874997] 2025-08-14T21:35:44.8754118Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:44.8757997Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_functionalization_of_rng_ops.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:44.875332] 2025-08-14T21:35:49.4481671Z 2025-08-14T21:35:49.4483328Z test_functionalization_of_rng_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_functionalization_of_rng_ops_1.1_f2ac459ebeb667b7_.log 2025-08-14T21:35:49.4484130Z Running 0 items in this shard: 2025-08-14T21:35:49.4484318Z 2025-08-14T21:35:49.4485151Z Running dynamo/test_hooks 1/1 ... [2025-08-14 21:35:49.448254] 2025-08-14T21:35:49.4485575Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:49.4489289Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_hooks.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:49.448600] 2025-08-14T21:35:53.6204296Z 2025-08-14T21:35:53.6205138Z dynamo/test_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_hooks_1.1_eed85e2d259511ff_.log 2025-08-14T21:35:53.6205788Z Running 0 items in this shard: 2025-08-14T21:35:53.6205973Z 2025-08-14T21:35:53.6208001Z Running dynamo/test_generator 1/1 ... [2025-08-14 21:35:53.620407] 2025-08-14T21:35:53.6208408Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:53.6211121Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_generator.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:53.620750] 2025-08-14T21:35:57.7926468Z 2025-08-14T21:35:57.7927031Z dynamo/test_generator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_generator_1.1_07c9831a7f3865b7_.log 2025-08-14T21:35:57.7927955Z Running 0 items in this shard: 2025-08-14T21:35:57.7928135Z 2025-08-14T21:35:57.7930465Z Running dynamo/test_reorder_logs 1/1 ... [2025-08-14 21:35:57.792717] 2025-08-14T21:35:57.7930877Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:35:57.7933657Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_reorder_logs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:35:57.793054] 2025-08-14T21:36:01.9653846Z 2025-08-14T21:36:01.9655021Z dynamo/test_reorder_logs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_reorder_logs_1.1_4a05519ab7b741a8_.log 2025-08-14T21:36:01.9656180Z Running 0 items in this shard: 2025-08-14T21:36:01.9656374Z 2025-08-14T21:36:01.9658750Z Running dynamo/test_subclasses 1/1 ... [2025-08-14 21:36:01.965593] 2025-08-14T21:36:01.9659168Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:01.9663266Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_subclasses.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:01.965986] 2025-08-14T21:36:08.8429521Z 2025-08-14T21:36:08.8430127Z dynamo/test_subclasses 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_subclasses_1.1_25bbe11b539e35b7_.log 2025-08-14T21:36:08.8430974Z Running 0 items in this shard: 2025-08-14T21:36:08.8431160Z 2025-08-14T21:36:08.8433517Z Running export/test_sparse 1/1 ... [2025-08-14 21:36:08.843064] 2025-08-14T21:36:08.8433915Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:08.8437219Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_sparse.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:08.843407] 2025-08-14T21:36:12.8151562Z 2025-08-14T21:36:12.8152375Z export/test_sparse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_sparse_1.1_062c01233bbe2a29_.log 2025-08-14T21:36:12.8153108Z Running 0 items in this shard: 2025-08-14T21:36:12.8153289Z 2025-08-14T21:36:12.8155298Z Running functorch/test_eager_transforms 1/1 ... [2025-08-14 21:36:12.815101] 2025-08-14T21:36:12.8155825Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:12.8157553Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_eager_transforms.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:12.815442] 2025-08-14T21:36:18.5409177Z 2025-08-14T21:36:18.5410191Z functorch/test_eager_transforms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_eager_transforms_1.1_1c868ebc237dc36c_.log 2025-08-14T21:36:18.5410962Z Running 0 items in this shard: 2025-08-14T21:36:18.5411150Z 2025-08-14T21:36:18.5412934Z Running export/test_draft_export 1/1 ... [2025-08-14 21:36:18.540999] 2025-08-14T21:36:18.5413337Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:18.5416844Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_draft_export.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:18.541330] 2025-08-14T21:36:22.7632415Z 2025-08-14T21:36:22.7633241Z export/test_draft_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_draft_export_1.1_209b8ecd104b73f2_.log 2025-08-14T21:36:22.7634352Z Running 0 items in this shard: 2025-08-14T21:36:22.7634542Z 2025-08-14T21:36:22.7636205Z Running dynamo/test_comptime 1/1 ... [2025-08-14 21:36:22.763317] 2025-08-14T21:36:22.7637389Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:22.7639836Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_comptime.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:22.763665] 2025-08-14T21:36:26.9361692Z 2025-08-14T21:36:26.9362514Z dynamo/test_comptime 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_comptime_1.1_dbfcdeda804ffc40_.log 2025-08-14T21:36:26.9363220Z Running 0 items in this shard: 2025-08-14T21:36:26.9363402Z 2025-08-14T21:36:26.9366223Z Running dynamo/test_python_autograd 1/1 ... [2025-08-14 21:36:26.936288] 2025-08-14T21:36:26.9366639Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:26.9369907Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_python_autograd.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:26.936648] 2025-08-14T21:36:31.1089401Z 2025-08-14T21:36:31.1090162Z dynamo/test_python_autograd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_python_autograd_1.1_b0d85c8deeb69608_.log 2025-08-14T21:36:31.1090904Z Running 0 items in this shard: 2025-08-14T21:36:31.1091086Z 2025-08-14T21:36:31.1093441Z Running test_quantization 1/4 ... [2025-08-14 21:36:31.109081] 2025-08-14T21:36:31.1093822Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:31.1097622Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_quantization.py', '-m', 'serial', '--shard-id=1', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:31.109455] 2025-08-14T21:36:37.1345562Z 2025-08-14T21:36:37.1346310Z test_quantization 1/4 was successful, full logs can be found in artifacts with path test/test-reports/test_quantization_1.4_9d99d25d6b9bebf5_.log 2025-08-14T21:36:37.1347000Z Running 0 items in this shard: 2025-08-14T21:36:37.1347487Z 2025-08-14T21:36:37.1349253Z Running test_dynamic_shapes 1/1 ... [2025-08-14 21:36:37.134647] 2025-08-14T21:36:37.1349628Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:37.1353035Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_dynamic_shapes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:37.135007] 2025-08-14T21:36:41.3570651Z 2025-08-14T21:36:41.3571623Z test_dynamic_shapes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_dynamic_shapes_1.1_f9cb3f8d1fac3562_.log 2025-08-14T21:36:41.3572314Z Running 0 items in this shard: 2025-08-14T21:36:41.3572495Z 2025-08-14T21:36:41.3575003Z Running higher_order_ops/test_invoke_subgraph 1/1 ... [2025-08-14 21:36:41.357203] 2025-08-14T21:36:41.3575509Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:41.3578889Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'higher_order_ops/test_invoke_subgraph.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:41.357554] 2025-08-14T21:36:48.2341920Z 2025-08-14T21:36:48.2342819Z higher_order_ops/test_invoke_subgraph 1/1 was successful, full logs can be found in artifacts with path test/test-reports/higher_order_ops.test_invoke_subgraph_1.1_b76a8da441fbc091_.log 2025-08-14T21:36:48.2343761Z Running 0 items in this shard: 2025-08-14T21:36:48.2343951Z 2025-08-14T21:36:48.2345426Z Running test_package 1/1 ... [2025-08-14 21:36:48.234170] 2025-08-14T21:36:48.2345809Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:48.2348249Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_package.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:48.234509] 2025-08-14T21:36:52.3565393Z 2025-08-14T21:36:52.3566446Z test_package 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_package_1.1_a76776829667ac1d_.log 2025-08-14T21:36:52.3567068Z Running 0 items in this shard: 2025-08-14T21:36:52.3567244Z 2025-08-14T21:36:52.3569298Z Running functorch/test_rearrange 1/1 ... [2025-08-14 21:36:52.356622] 2025-08-14T21:36:52.3569863Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:52.3573573Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_rearrange.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:52.356974] 2025-08-14T21:36:56.2288126Z 2025-08-14T21:36:56.2289179Z functorch/test_rearrange 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_rearrange_1.1_7502235f9b58e1be_.log 2025-08-14T21:36:56.2290062Z Running 0 items in this shard: 2025-08-14T21:36:56.2290251Z 2025-08-14T21:36:56.2291486Z Running functorch/test_parsing 1/1 ... [2025-08-14 21:36:56.228798] 2025-08-14T21:36:56.2292049Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:36:56.2295998Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_parsing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:36:56.229235] 2025-08-14T21:37:00.0505484Z 2025-08-14T21:37:00.0506685Z functorch/test_parsing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_parsing_1.1_c39affa5939ca9f1_.log 2025-08-14T21:37:00.0507397Z Running 0 items in this shard: 2025-08-14T21:37:00.0507607Z 2025-08-14T21:37:00.0509677Z Running test_hop_infra 1/1 ... [2025-08-14 21:37:00.050653] 2025-08-14T21:37:00.0510180Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:00.0514056Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_hop_infra.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:00.051033] 2025-08-14T21:37:04.7236945Z 2025-08-14T21:37:04.7237890Z test_hop_infra 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_hop_infra_1.1_d1bb2d795a77e2ba_.log 2025-08-14T21:37:04.7238640Z Running 0 items in this shard: 2025-08-14T21:37:04.7238833Z 2025-08-14T21:37:04.7245269Z Running test_comparison_utils 1/1 ... [2025-08-14 21:37:04.724185] 2025-08-14T21:37:04.7245803Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:04.7249467Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_comparison_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:04.724566] 2025-08-14T21:37:08.5970648Z 2025-08-14T21:37:08.5981050Z test_comparison_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_comparison_utils_1.1_1b8b221e45cbcafa_.log 2025-08-14T21:37:08.5981941Z Running 0 items in this shard: 2025-08-14T21:37:08.5982196Z 2025-08-14T21:37:08.5982474Z Running functorch/test_control_flow 1/1 ... [2025-08-14 21:37:08.596731] 2025-08-14T21:37:08.5983006Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:08.5984023Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_control_flow.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:08.597057] 2025-08-14T21:37:13.6710143Z 2025-08-14T21:37:13.6711245Z functorch/test_control_flow 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_control_flow_1.1_f1c17bc87f00f757_.log 2025-08-14T21:37:13.6712017Z Running 0 items in this shard: 2025-08-14T21:37:13.6712263Z 2025-08-14T21:37:13.6712566Z Running functorch/test_ac_logging 1/1 ... [2025-08-14 21:37:13.670805] 2025-08-14T21:37:13.6713121Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:13.6714749Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ac_logging.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:13.671131] 2025-08-14T21:37:17.4929151Z 2025-08-14T21:37:17.4930294Z functorch/test_ac_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ac_logging_1.1_28fd3d4086bed1e0_.log 2025-08-14T21:37:17.4931056Z Running 0 items in this shard: 2025-08-14T21:37:17.4931245Z 2025-08-14T21:37:17.4932040Z Running test_mkl_verbose 1/1 ... [2025-08-14 21:37:17.492960] 2025-08-14T21:37:17.4932550Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:17.4936242Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkl_verbose.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:17.493281] 2025-08-14T21:37:21.3150011Z 2025-08-14T21:37:21.3150855Z test_mkl_verbose 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkl_verbose_1.1_f614497f0410631b_.log 2025-08-14T21:37:21.3151518Z Running 0 items in this shard: 2025-08-14T21:37:21.3151704Z 2025-08-14T21:37:21.3153876Z Running test_appending_byte_serializer 1/1 ... [2025-08-14 21:37:21.315108] 2025-08-14T21:37:21.3154317Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:21.3157535Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_appending_byte_serializer.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:21.315430] 2025-08-14T21:37:25.1370822Z 2025-08-14T21:37:25.1372023Z test_appending_byte_serializer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_appending_byte_serializer_1.1_bad8dde96e09343c_.log 2025-08-14T21:37:25.1373054Z Running 0 items in this shard: 2025-08-14T21:37:25.1373264Z 2025-08-14T21:37:25.1374264Z Running test_autoload 1/1 ... [2025-08-14 21:37:25.137163] 2025-08-14T21:37:25.1374768Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:25.1378396Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_autoload.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:25.137487] 2025-08-14T21:37:28.9588963Z 2025-08-14T21:37:28.9590068Z test_autoload 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_autoload_1.1_292d270800684cbb_.log 2025-08-14T21:37:28.9590791Z Running 0 items in this shard: 2025-08-14T21:37:28.9590969Z 2025-08-14T21:37:28.9592561Z Running test_proxy_tensor 1/1 ... [2025-08-14 21:37:28.958956] 2025-08-14T21:37:28.9593075Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:28.9596450Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_proxy_tensor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:28.959300] 2025-08-14T21:37:34.7343439Z 2025-08-14T21:37:34.7344193Z test_proxy_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_proxy_tensor_1.1_b158154d5a1d679d_.log 2025-08-14T21:37:34.7344861Z Running 0 items in this shard: 2025-08-14T21:37:34.7345044Z 2025-08-14T21:37:34.7347408Z Running test_mkldnn_verbose 1/1 ... [2025-08-14 21:37:34.734473] 2025-08-14T21:37:34.7347904Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:34.7351674Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkldnn_verbose.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:34.734816] 2025-08-14T21:37:38.5563363Z 2025-08-14T21:37:38.5564164Z test_mkldnn_verbose 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkldnn_verbose_1.1_34ab693c2987bebd_.log 2025-08-14T21:37:38.5564922Z Running 0 items in this shard: 2025-08-14T21:37:38.5565113Z 2025-08-14T21:37:38.5566686Z Running test_utils_config_module 1/1 ... [2025-08-14 21:37:38.556370] 2025-08-14T21:37:38.5567093Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:38.5570410Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_utils_config_module.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:38.556724] 2025-08-14T21:37:42.4789503Z 2025-08-14T21:37:42.4790419Z test_utils_config_module 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_utils_config_module_1.1_bd5ae5652df018f0_.log 2025-08-14T21:37:42.4791252Z Running 0 items in this shard: 2025-08-14T21:37:42.4791437Z 2025-08-14T21:37:42.4793063Z Running test_functionalization 1/1 ... [2025-08-14 21:37:42.479040] 2025-08-14T21:37:42.4793481Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:42.4796653Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_functionalization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:42.479365] 2025-08-14T21:37:46.4529143Z 2025-08-14T21:37:46.4530028Z test_functionalization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_functionalization_1.1_fb23d506bf758cc2_.log 2025-08-14T21:37:46.4530731Z Running 0 items in this shard: 2025-08-14T21:37:46.4530915Z 2025-08-14T21:37:46.4531173Z Running test_rename_privateuse1_to_existing_device 1/1 ... [2025-08-14 21:37:46.452811] 2025-08-14T21:37:46.4531626Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:46.4535263Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_rename_privateuse1_to_existing_device.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:46.453177] 2025-08-14T21:37:50.2753119Z 2025-08-14T21:37:50.2754083Z test_rename_privateuse1_to_existing_device 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_rename_privateuse1_to_existing_device_1.1_2d86974612dfb5d2_.log 2025-08-14T21:37:50.2754935Z Running 0 items in this shard: 2025-08-14T21:37:50.2755117Z 2025-08-14T21:37:50.2756917Z Running test_transformers 1/1 ... [2025-08-14 21:37:50.275425] 2025-08-14T21:37:50.2757381Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:50.2760890Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_transformers.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:50.275777] 2025-08-14T21:37:58.4047390Z 2025-08-14T21:37:58.4048357Z test_transformers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_transformers_1.1_b0c8acb976b034f3_.log 2025-08-14T21:37:58.4049009Z Running 0 items in this shard: 2025-08-14T21:37:58.4049197Z 2025-08-14T21:37:58.4050488Z Running test_ao_sparsity 1/1 ... [2025-08-14 21:37:58.404603] 2025-08-14T21:37:58.4050870Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:37:58.4052954Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ao_sparsity.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:37:58.404969] 2025-08-14T21:38:02.7774574Z 2025-08-14T21:38:02.7775694Z test_ao_sparsity 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ao_sparsity_1.1_9b1fb0198dc0af7f_.log 2025-08-14T21:38:02.7776352Z Running 0 items in this shard: 2025-08-14T21:38:02.7778510Z 2025-08-14T21:38:02.7779244Z Running test_license 1/1 ... [2025-08-14 21:38:02.777521] 2025-08-14T21:38:02.7779620Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:02.7782325Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_license.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:02.777909] 2025-08-14T21:38:06.5995288Z 2025-08-14T21:38:06.5996104Z test_license 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_license_1.1_d0d720b3950d939a_.log 2025-08-14T21:38:06.5996729Z Running 0 items in this shard: 2025-08-14T21:38:06.5996910Z 2025-08-14T21:38:06.5998570Z Running torch_np/test_binary_ufuncs 1/1 ... [2025-08-14 21:38:06.599552] 2025-08-14T21:38:06.5998983Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:06.6003314Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_binary_ufuncs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:06.599896] 2025-08-14T21:38:10.5717790Z 2025-08-14T21:38:10.5718751Z torch_np/test_binary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_binary_ufuncs_1.1_320e65142ee686e8_.log 2025-08-14T21:38:10.5719495Z Running 0 items in this shard: 2025-08-14T21:38:10.5719687Z 2025-08-14T21:38:10.5722456Z Running torch_np/test_unary_ufuncs 1/1 ... [2025-08-14 21:38:10.571944] 2025-08-14T21:38:10.5722976Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:10.5726682Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_unary_ufuncs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:10.572322] 2025-08-14T21:38:14.4945140Z 2025-08-14T21:38:14.4946062Z torch_np/test_unary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_unary_ufuncs_1.1_76b5bdb4001fef58_.log 2025-08-14T21:38:14.4946792Z Running 0 items in this shard: 2025-08-14T21:38:14.4946973Z 2025-08-14T21:38:14.4948951Z Running test_foreach 1/1 ... [2025-08-14 21:38:14.494599] 2025-08-14T21:38:14.4949419Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:14.4953372Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_foreach.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:14.494998] 2025-08-14T21:38:23.0761695Z 2025-08-14T21:38:23.0762482Z test_foreach 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_foreach_1.1_badc3767e4b1644b_.log 2025-08-14T21:38:23.0763422Z Running 0 items in this shard: 2025-08-14T21:38:23.0763612Z 2025-08-14T21:38:23.0765590Z Running test_expanded_weights 1/1 ... [2025-08-14 21:38:23.076268] 2025-08-14T21:38:23.0766082Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:23.0769387Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_expanded_weights.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:23.076611] 2025-08-14T21:38:28.5508469Z 2025-08-14T21:38:28.5509225Z test_expanded_weights 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_expanded_weights_1.1_2d9d6939094d86e9_.log 2025-08-14T21:38:28.5509921Z Running 0 items in this shard: 2025-08-14T21:38:28.5510106Z 2025-08-14T21:38:28.5512362Z Running typing/test_python_operators 1/1 ... [2025-08-14 21:38:28.550924] 2025-08-14T21:38:28.5512798Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:28.5515801Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'typing/test_python_operators.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:28.551257] 2025-08-14T21:38:32.5230215Z 2025-08-14T21:38:32.5231164Z typing/test_python_operators 1/1 was successful, full logs can be found in artifacts with path test/test-reports/typing.test_python_operators_1.1_68470c26d8430657_.log 2025-08-14T21:38:32.5231966Z Running 0 items in this shard: 2025-08-14T21:38:32.5232142Z 2025-08-14T21:38:32.5233770Z Running test_fx_experimental 1/1 ... [2025-08-14 21:38:32.523075] 2025-08-14T21:38:32.5234276Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:32.5237441Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx_experimental.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:32.523408] 2025-08-14T21:38:38.2486850Z 2025-08-14T21:38:38.2487587Z test_fx_experimental 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_experimental_1.1_a8cce33eaa1086ce_.log 2025-08-14T21:38:38.2488294Z Running 0 items in this shard: 2025-08-14T21:38:38.2488482Z 2025-08-14T21:38:38.2490526Z Running test_file_check 1/1 ... [2025-08-14 21:38:38.248759] 2025-08-14T21:38:38.2490910Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:38.2494475Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_file_check.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:38.249112] 2025-08-14T21:38:42.0709194Z 2025-08-14T21:38:42.0710772Z test_file_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_file_check_1.1_c90f937f7159c024_.log 2025-08-14T21:38:42.0712544Z Running 0 items in this shard: 2025-08-14T21:38:42.0714770Z 2025-08-14T21:38:42.0716529Z Running torch_np/test_dtype 1/1 ... [2025-08-14 21:38:42.071314] 2025-08-14T21:38:42.0717110Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:42.0721263Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_dtype.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:42.071774] 2025-08-14T21:38:46.0435909Z 2025-08-14T21:38:46.0436732Z torch_np/test_dtype 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_dtype_1.1_abb4dda8192a1528_.log 2025-08-14T21:38:46.0437440Z Running 0 items in this shard: 2025-08-14T21:38:46.0437634Z 2025-08-14T21:38:46.0438048Z Running higher_order_ops/test_with_effects 1/1 ... [2025-08-14 21:38:46.043541] 2025-08-14T21:38:46.0438773Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:46.0442753Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'higher_order_ops/test_with_effects.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:46.043942] 2025-08-14T21:38:50.8175194Z 2025-08-14T21:38:50.8176806Z higher_order_ops/test_with_effects 1/1 was successful, full logs can be found in artifacts with path test/test-reports/higher_order_ops.test_with_effects_1.1_352a8a0d1c12b1d4_.log 2025-08-14T21:38:50.8177759Z Running 0 items in this shard: 2025-08-14T21:38:50.8177939Z 2025-08-14T21:38:50.8180497Z Running test_native_functions 1/1 ... [2025-08-14 21:38:50.817769] 2025-08-14T21:38:50.8180893Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:50.8184806Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_native_functions.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:50.818115] 2025-08-14T21:38:54.6398789Z 2025-08-14T21:38:54.6399784Z test_native_functions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_native_functions_1.1_e97c2556dc14a754_.log 2025-08-14T21:38:54.6400474Z Running 0 items in this shard: 2025-08-14T21:38:54.6400658Z 2025-08-14T21:38:54.6401157Z Running functorch/test_minifier 1/1 ... [2025-08-14 21:38:54.639873] 2025-08-14T21:38:54.6401644Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:54.6405314Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_minifier.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:54.640206] 2025-08-14T21:38:58.8126758Z 2025-08-14T21:38:58.8127776Z functorch/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_minifier_1.1_e5c6449634bc41ef_.log 2025-08-14T21:38:58.8128494Z Running 0 items in this shard: 2025-08-14T21:38:58.8128675Z 2025-08-14T21:38:58.8130214Z Running test_flop_counter 1/1 ... [2025-08-14 21:38:58.812741] 2025-08-14T21:38:58.8130703Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:38:58.8134219Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_flop_counter.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:38:58.813107] 2025-08-14T21:39:03.3357283Z 2025-08-14T21:39:03.3358693Z test_flop_counter 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_flop_counter_1.1_fa3f55f5fd24ddc9_.log 2025-08-14T21:39:03.3359808Z Running 0 items in this shard: 2025-08-14T21:39:03.3360128Z 2025-08-14T21:39:03.3361254Z Running torch_np/test_nep50_examples 1/1 ... [2025-08-14 21:39:03.335846] 2025-08-14T21:39:03.3361674Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:03.3366354Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_nep50_examples.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:03.336278] 2025-08-14T21:39:07.4584233Z 2025-08-14T21:39:07.4585321Z torch_np/test_nep50_examples 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_nep50_examples_1.1_2387ae9205898e1c_.log 2025-08-14T21:39:07.4586058Z Running 0 items in this shard: 2025-08-14T21:39:07.4586244Z 2025-08-14T21:39:07.4588099Z Running higher_order_ops/test_invoke_quant 1/1 ... [2025-08-14 21:39:07.458482] 2025-08-14T21:39:07.4588576Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:07.4591761Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'higher_order_ops/test_invoke_quant.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:07.458837] 2025-08-14T21:39:14.0865212Z 2025-08-14T21:39:14.0866082Z higher_order_ops/test_invoke_quant 1/1 was successful, full logs can be found in artifacts with path test/test-reports/higher_order_ops.test_invoke_quant_1.1_e1ac74ec1b803681_.log 2025-08-14T21:39:14.0866866Z Running 0 items in this shard: 2025-08-14T21:39:14.0867050Z 2025-08-14T21:39:14.0868985Z Running test_per_overload_api 1/1 ... [2025-08-14 21:39:14.086640] 2025-08-14T21:39:14.0869544Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:14.0873486Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_per_overload_api.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:14.087006] 2025-08-14T21:39:17.9086523Z 2025-08-14T21:39:17.9087620Z test_per_overload_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_per_overload_api_1.1_41d416732ec83204_.log 2025-08-14T21:39:17.9088809Z Running 0 items in this shard: 2025-08-14T21:39:17.9089102Z 2025-08-14T21:39:17.9092594Z Running test_tensorexpr_pybind 1/1 ... [2025-08-14 21:39:17.908941] 2025-08-14T21:39:17.9093176Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:17.9096789Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_tensorexpr_pybind.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:17.909314] 2025-08-14T21:39:21.8814464Z 2025-08-14T21:39:21.8815739Z test_tensorexpr_pybind 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_tensorexpr_pybind_1.1_b4c14d6fcc87ecc7_.log 2025-08-14T21:39:21.8817071Z Running 0 items in this shard: 2025-08-14T21:39:21.8817344Z 2025-08-14T21:39:21.8817680Z Running test_pytree 1/1 ... [2025-08-14 21:39:21.881476] 2025-08-14T21:39:21.8818056Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:21.8822051Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_pytree.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:21.881897] 2025-08-14T21:39:25.8540315Z 2025-08-14T21:39:25.8541235Z test_pytree 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_pytree_1.1_ec340c49cd9d473f_.log 2025-08-14T21:39:25.8541864Z Running 0 items in this shard: 2025-08-14T21:39:25.8542053Z 2025-08-14T21:39:25.8542871Z Running test_fx_passes 1/1 ... [2025-08-14 21:39:25.854032] 2025-08-14T21:39:25.8543244Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:25.8547538Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx_passes.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:25.854416] 2025-08-14T21:39:29.8760985Z 2025-08-14T21:39:29.8761724Z test_fx_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_passes_1.1_1c67b10632292282_.log 2025-08-14T21:39:29.8762448Z Running 0 items in this shard: 2025-08-14T21:39:29.8762634Z 2025-08-14T21:39:29.8764211Z Running test_module_tracker 1/1 ... [2025-08-14 21:39:29.876136] 2025-08-14T21:39:29.8764676Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:29.8769117Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_module_tracker.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:29.876577] 2025-08-14T21:39:33.6986422Z 2025-08-14T21:39:33.6987558Z test_module_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_module_tracker_1.1_b7fd2034efdc73f1_.log 2025-08-14T21:39:33.6988263Z Running 0 items in this shard: 2025-08-14T21:39:33.6988450Z 2025-08-14T21:39:33.6993094Z Running test_typing 1/1 ... [2025-08-14 21:39:33.699040] 2025-08-14T21:39:33.6993446Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:33.6996855Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_typing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:33.699366] 2025-08-14T21:39:37.6716052Z 2025-08-14T21:39:37.6717078Z test_typing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_typing_1.1_6e540a0c3fe3fbc3_.log 2025-08-14T21:39:37.6717729Z Running 0 items in this shard: 2025-08-14T21:39:37.6717911Z 2025-08-14T21:39:37.6719126Z Running test_openmp 1/1 ... [2025-08-14 21:39:37.671623] 2025-08-14T21:39:37.6719502Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:37.6723543Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_openmp.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:37.671954] 2025-08-14T21:39:41.6441586Z 2025-08-14T21:39:41.6443196Z test_openmp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_openmp_1.1_72ce3994a9a72144_.log 2025-08-14T21:39:41.6443930Z Running 0 items in this shard: 2025-08-14T21:39:41.6444118Z 2025-08-14T21:39:41.6447497Z Running torch_np/numpy_tests/core/test_scalarinherit 1/1 ... [2025-08-14 21:39:41.644457] 2025-08-14T21:39:41.6447976Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:41.6451374Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_scalarinherit.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:41.644799] 2025-08-14T21:39:45.4666094Z 2025-08-14T21:39:45.4667083Z torch_np/numpy_tests/core/test_scalarinherit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_scalarinherit_1.1_ec9ce57b68ffd52a_.log 2025-08-14T21:39:45.4667946Z Running 0 items in this shard: 2025-08-14T21:39:45.4668142Z 2025-08-14T21:39:45.4669645Z Running functorch/test_ac_knapsack 1/1 ... [2025-08-14 21:39:45.466625] 2025-08-14T21:39:45.4670066Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:45.4673222Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ac_knapsack.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:45.466958] 2025-08-14T21:39:49.5392661Z 2025-08-14T21:39:49.5393631Z functorch/test_ac_knapsack 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ac_knapsack_1.1_e99acd62d792888f_.log 2025-08-14T21:39:49.5394373Z Running 0 items in this shard: 2025-08-14T21:39:49.5394559Z 2025-08-14T21:39:49.5394793Z Running backends/xeon/test_launch 1/1 ... [2025-08-14 21:39:49.539225] 2025-08-14T21:39:49.5395199Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:49.5399002Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'backends/xeon/test_launch.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:49.539564] 2025-08-14T21:39:53.3612807Z 2025-08-14T21:39:53.3615224Z backends/xeon/test_launch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/backends.xeon.test_launch_1.1_5b8208ef42906249_.log 2025-08-14T21:39:53.3616785Z Running 0 items in this shard: 2025-08-14T21:39:53.3617008Z 2025-08-14T21:39:53.3617184Z Running test_nestedtensor 1/1 ... [2025-08-14 21:39:53.361233] 2025-08-14T21:39:53.3617594Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:53.3618773Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:53.361558] 2025-08-14T21:39:59.5377006Z 2025-08-14T21:39:59.5377750Z test_nestedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_1.1_03c1ce670700ec21_.log 2025-08-14T21:39:59.5379154Z Running 1 items in this shard: test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_backward_memory_usage_cuda_float32 2025-08-14T21:39:59.5379721Z 2025-08-14T21:39:59.5382615Z Running test_namedtensor 1/1 ... [2025-08-14 21:39:59.537981] 2025-08-14T21:39:59.5383078Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:39:59.5386356Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_namedtensor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:39:59.538306] 2025-08-14T21:40:03.7106696Z 2025-08-14T21:40:03.7107622Z test_namedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_namedtensor_1.1_0d923a3a66c31934_.log 2025-08-14T21:40:03.7108453Z Running 0 items in this shard: 2025-08-14T21:40:03.7108638Z 2025-08-14T21:40:03.7110445Z Running xpu/test_gemm 1/1 ... [2025-08-14 21:40:03.710722] 2025-08-14T21:40:03.7110841Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:03.7114129Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'xpu/test_gemm.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:03.711045] 2025-08-14T21:40:08.0838909Z 2025-08-14T21:40:08.0840200Z xpu/test_gemm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/xpu.test_gemm_1.1_238f31183018d989_.log 2025-08-14T21:40:08.0841441Z Running 0 items in this shard: 2025-08-14T21:40:08.0841705Z 2025-08-14T21:40:08.0841922Z Running torch_np/test_ufuncs_basic 1/1 ... [2025-08-14 21:40:08.083881] 2025-08-14T21:40:08.0842396Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:08.0845393Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_ufuncs_basic.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:08.084198] 2025-08-14T21:40:12.1061508Z 2025-08-14T21:40:12.1062751Z torch_np/test_ufuncs_basic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_ufuncs_basic_1.1_1744d4fb3bce5f13_.log 2025-08-14T21:40:12.1064077Z Running 0 items in this shard: 2025-08-14T21:40:12.1064420Z 2025-08-14T21:40:12.1065242Z Running test_meta 1/1 ... [2025-08-14 21:40:12.106251] 2025-08-14T21:40:12.1065781Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:12.1070525Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_meta.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:12.106695] 2025-08-14T21:40:26.5457940Z 2025-08-14T21:40:26.5458759Z test_meta 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_meta_1.1_2425b18fc48f3157_.log 2025-08-14T21:40:26.5459884Z Running 0 items in this shard: 2025-08-14T21:40:26.5460107Z 2025-08-14T21:40:26.5461842Z Running test_utils_filelock 1/1 ... [2025-08-14 21:40:26.545890] 2025-08-14T21:40:26.5462218Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:26.5465863Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_utils_filelock.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:26.546245] 2025-08-14T21:40:30.4178918Z 2025-08-14T21:40:30.4179669Z test_utils_filelock 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_utils_filelock_1.1_687f19e4b3d82991_.log 2025-08-14T21:40:30.4180352Z Running 0 items in this shard: 2025-08-14T21:40:30.4180536Z 2025-08-14T21:40:30.4182459Z Running torch_np/test_random 1/1 ... [2025-08-14 21:40:30.417935] 2025-08-14T21:40:30.4182860Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:30.4185638Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_random.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:30.418254] 2025-08-14T21:40:34.2901050Z 2025-08-14T21:40:34.2901786Z torch_np/test_random 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_random_1.1_da908466fd69853a_.log 2025-08-14T21:40:34.2902465Z Running 0 items in this shard: 2025-08-14T21:40:34.2902646Z 2025-08-14T21:40:34.2904572Z Running profiler/test_kineto 1/1 ... [2025-08-14 21:40:34.290198] 2025-08-14T21:40:34.2904980Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:34.2908654Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_kineto.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:34.290510] 2025-08-14T21:40:38.1118903Z 2025-08-14T21:40:38.1119610Z profiler/test_kineto 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_kineto_1.1_f804d73d6e13a2b5_.log 2025-08-14T21:40:38.1120296Z Running 0 items in this shard: 2025-08-14T21:40:38.1120476Z 2025-08-14T21:40:38.1121927Z Running test_compile_benchmark_util 1/1 ... [2025-08-14 21:40:38.111930] 2025-08-14T21:40:38.1125112Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:38.1126097Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_compile_benchmark_util.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:38.112255] 2025-08-14T21:40:42.2847335Z 2025-08-14T21:40:42.2848311Z test_compile_benchmark_util 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_compile_benchmark_util_1.1_76645234b3b0d907_.log 2025-08-14T21:40:42.2849067Z Running 0 items in this shard: 2025-08-14T21:40:42.2849249Z 2025-08-14T21:40:42.2850792Z Running test_jiterator 1/1 ... [2025-08-14 21:40:42.284773] 2025-08-14T21:40:42.2851180Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:42.2854408Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jiterator.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:42.285102] 2025-08-14T21:40:46.6576038Z 2025-08-14T21:40:46.6576927Z test_jiterator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_jiterator_1.1_46e7ac5486686c8d_.log 2025-08-14T21:40:46.6577763Z Running 0 items in this shard: 2025-08-14T21:40:46.6577944Z 2025-08-14T21:40:46.6579388Z Running test_optim 1/1 ... [2025-08-14 21:40:46.657650] 2025-08-14T21:40:46.6580053Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:46.6582921Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_optim.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:46.657977] 2025-08-14T21:40:52.1823507Z 2025-08-14T21:40:52.1824691Z test_optim 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_optim_1.1_19eaa96a8d4541b8_.log 2025-08-14T21:40:52.1825645Z Running 0 items in this shard: 2025-08-14T21:40:52.1825988Z 2025-08-14T21:40:52.1828038Z Running test_decomp 1/16 ... [2025-08-14 21:40:52.182476] 2025-08-14T21:40:52.1828557Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:52.1833076Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=1', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:52.182911] 2025-08-14T21:40:59.5104551Z 2025-08-14T21:40:59.5105640Z test_decomp 1/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_1.16_b0b09544b76e1c63_.log 2025-08-14T21:40:59.5106580Z Running 0 items in this shard: 2025-08-14T21:40:59.5106845Z 2025-08-14T21:40:59.5108748Z Running test_decomp 5/16 ... [2025-08-14 21:40:59.510540] 2025-08-14T21:40:59.5109298Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:40:59.5111868Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=5', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:40:59.510872] 2025-08-14T21:41:06.8382967Z 2025-08-14T21:41:06.8383850Z test_decomp 5/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_5.16_c11d4d109966cd6c_.log 2025-08-14T21:41:06.8384585Z Running 0 items in this shard: 2025-08-14T21:41:06.8384767Z 2025-08-14T21:41:06.8386335Z Running test_decomp 6/16 ... [2025-08-14 21:41:06.838360] 2025-08-14T21:41:06.8386684Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:06.8390262Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=6', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:06.838693] 2025-08-14T21:41:14.2217763Z 2025-08-14T21:41:14.2218668Z test_decomp 6/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_6.16_b1f9c052ce035092_.log 2025-08-14T21:41:14.2219417Z Running 0 items in this shard: 2025-08-14T21:41:14.2219597Z 2025-08-14T21:41:14.2221294Z Running test_decomp 10/16 ... [2025-08-14 21:41:14.221838] 2025-08-14T21:41:14.2221694Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:14.2224920Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=10', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:14.222170] 2025-08-14T21:41:21.5999010Z 2025-08-14T21:41:21.5999832Z test_decomp 10/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_10.16_94caaf3c5dab727a_.log 2025-08-14T21:41:21.6000638Z Running 0 items in this shard: 2025-08-14T21:41:21.6000822Z 2025-08-14T21:41:21.6002522Z Running test_decomp 14/16 ... [2025-08-14 21:41:21.599965] 2025-08-14T21:41:21.6002906Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:21.6006770Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=14', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:21.600326] 2025-08-14T21:41:28.9283567Z 2025-08-14T21:41:28.9284489Z test_decomp 14/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_14.16_37f87943fae2d21b_.log 2025-08-14T21:41:28.9285220Z Running 0 items in this shard: 2025-08-14T21:41:28.9285413Z 2025-08-14T21:41:28.9287039Z Running test_decomp 15/16 ... [2025-08-14 21:41:28.928379] 2025-08-14T21:41:28.9287400Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:28.9290339Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'serial', '--shard-id=15', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:28.928727] 2025-08-14T21:41:36.2558861Z 2025-08-14T21:41:36.2560088Z test_decomp 15/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_15.16_0f80eddad8cfd275_.log 2025-08-14T21:41:36.2560724Z Running 0 items in this shard: 2025-08-14T21:41:36.2560941Z 2025-08-14T21:41:36.2562374Z Running test_binary_ufuncs 1/1 ... [2025-08-14 21:41:36.255961] 2025-08-14T21:41:36.2562752Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:36.2566257Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_binary_ufuncs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:36.256297] 2025-08-14T21:41:44.3350930Z 2025-08-14T21:41:44.3351679Z test_binary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_binary_ufuncs_1.1_ba22464ca86353a9_.log 2025-08-14T21:41:44.3352373Z Running 0 items in this shard: 2025-08-14T21:41:44.3352555Z 2025-08-14T21:41:44.3358123Z Running test_matmul_cuda 1/1 ... [2025-08-14 21:41:44.335518] 2025-08-14T21:41:44.3358541Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:44.3362250Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_matmul_cuda.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:44.335875] 2025-08-14T21:41:48.9587512Z 2025-08-14T21:41:48.9588250Z test_matmul_cuda 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_matmul_cuda_1.1_f60c34556e938061_.log 2025-08-14T21:41:48.9588901Z Running 0 items in this shard: 2025-08-14T21:41:48.9589087Z 2025-08-14T21:41:48.9590631Z Running test_weak 1/1 ... [2025-08-14 21:41:48.958801] 2025-08-14T21:41:48.9591020Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:48.9594719Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_weak.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:48.959135] 2025-08-14T21:41:52.9307088Z 2025-08-14T21:41:52.9307857Z test_weak 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_weak_1.1_7f7419b6f8c53683_.log 2025-08-14T21:41:52.9308582Z Running 0 items in this shard: 2025-08-14T21:41:52.9308772Z 2025-08-14T21:41:52.9310336Z Running functorch/test_logging 1/1 ... [2025-08-14 21:41:52.930734] 2025-08-14T21:41:52.9310747Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:52.9313954Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_logging.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:52.931051] 2025-08-14T21:41:57.1031904Z 2025-08-14T21:41:57.1032894Z functorch/test_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_logging_1.1_0831338a54dca243_.log 2025-08-14T21:41:57.1033612Z Running 0 items in this shard: 2025-08-14T21:41:57.1034116Z 2025-08-14T21:41:57.1035036Z Running test_logging 1/1 ... [2025-08-14 21:41:57.103258] 2025-08-14T21:41:57.1035397Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:41:57.1038882Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_logging.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:41:57.103577] 2025-08-14T21:42:00.9249770Z 2025-08-14T21:42:00.9250626Z test_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_logging_1.1_acc27cb618aa62db_.log 2025-08-14T21:42:00.9251261Z Running 0 items in this shard: 2025-08-14T21:42:00.9251448Z 2025-08-14T21:42:00.9253084Z Running test_out_dtype_op 1/1 ... [2025-08-14 21:42:00.925025] 2025-08-14T21:42:00.9253472Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:00.9256885Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_out_dtype_op.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:00.925345] 2025-08-14T21:42:05.4482751Z 2025-08-14T21:42:05.4483597Z test_out_dtype_op 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_out_dtype_op_1.1_1888ec089aaec0d8_.log 2025-08-14T21:42:05.4484239Z Running 0 items in this shard: 2025-08-14T21:42:05.4484519Z 2025-08-14T21:42:05.4486462Z Running test_complex 1/1 ... [2025-08-14 21:42:05.448359] 2025-08-14T21:42:05.4486832Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:05.4490069Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_complex.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:05.448698] 2025-08-14T21:42:09.8212334Z 2025-08-14T21:42:09.8213304Z test_complex 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_complex_1.1_5282aa5f14d98d61_.log 2025-08-14T21:42:09.8213938Z Running 0 items in this shard: 2025-08-14T21:42:09.8214128Z 2025-08-14T21:42:09.8215213Z Running cpp_extensions/python_agnostic_extension/test/test_python_agnostic 1/1 ... [2025-08-14 21:42:09.821243] 2025-08-14T21:42:09.8215831Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:09.8219546Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'cpp_extensions/python_agnostic_extension/test/test_python_agnostic.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:09.821601] 2025-08-14T21:42:34.7289065Z 2025-08-14T21:42:34.7290495Z cpp_extensions/python_agnostic_extension/test/test_python_agnostic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp_extensions.python_agnostic_extension.test.test_python_agnostic_1.1_eb61b89e110b1c28_.log 2025-08-14T21:42:34.7291668Z Running 0 items in this shard: 2025-08-14T21:42:34.7291864Z 2025-08-14T21:42:34.7292155Z Running torch_np/numpy_tests/core/test_einsum 1/1 ... [2025-08-14 21:42:34.728861] 2025-08-14T21:42:34.7292628Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:34.7295367Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_einsum.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:34.729194] 2025-08-14T21:42:38.7516092Z 2025-08-14T21:42:38.7517190Z torch_np/numpy_tests/core/test_einsum 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_einsum_1.1_65832b144ad4ec28_.log 2025-08-14T21:42:38.7518115Z Running 0 items in this shard: 2025-08-14T21:42:38.7518666Z 2025-08-14T21:42:38.7519100Z Running test_ops_jit 1/1 ... [2025-08-14 21:42:38.751674] 2025-08-14T21:42:38.7519535Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:38.7523565Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_jit.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:38.752036] 2025-08-14T21:42:44.2763934Z 2025-08-14T21:42:44.2764932Z test_ops_jit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_jit_1.1_7a4c4689e35378e5_.log 2025-08-14T21:42:44.2765542Z Running 0 items in this shard: 2025-08-14T21:42:44.2765730Z 2025-08-14T21:42:44.2767879Z Running lazy/test_step_closures 1/1 ... [2025-08-14 21:42:44.276494] 2025-08-14T21:42:44.2768293Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:44.2772082Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_step_closures.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:44.276857] 2025-08-14T21:42:48.0984173Z 2025-08-14T21:42:48.0984933Z lazy/test_step_closures 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_step_closures_1.1_707e7d67edacdfcf_.log 2025-08-14T21:42:48.0985625Z Running 0 items in this shard: 2025-08-14T21:42:48.0985806Z 2025-08-14T21:42:48.0987855Z Running test_ops_gradients 1/1 ... [2025-08-14 21:42:48.098515] 2025-08-14T21:42:48.0988263Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:48.0991992Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:48.098878] 2025-08-14T21:42:54.8252083Z 2025-08-14T21:42:54.8252946Z test_ops_gradients 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_1.1_0ad2ea512b3975e7_.log 2025-08-14T21:42:54.8253627Z Running 0 items in this shard: 2025-08-14T21:42:54.8253828Z 2025-08-14T21:42:54.8255700Z Running test_fx_reinplace_pass 1/1 ... [2025-08-14 21:42:54.825279] 2025-08-14T21:42:54.8256149Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:54.8259589Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx_reinplace_pass.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:54.825651] 2025-08-14T21:42:58.6978737Z 2025-08-14T21:42:58.6979619Z test_fx_reinplace_pass 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_reinplace_pass_1.1_65f9ccd1ae07cca1_.log 2025-08-14T21:42:58.6980316Z Running 0 items in this shard: 2025-08-14T21:42:58.6980514Z 2025-08-14T21:42:58.6983742Z Running nn/test_multihead_attention 1/1 ... [2025-08-14 21:42:58.698063] 2025-08-14T21:42:58.6984170Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:42:58.6987536Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_multihead_attention.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:42:58.698414] 2025-08-14T21:43:03.0715274Z 2025-08-14T21:43:03.0716829Z nn/test_multihead_attention 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_multihead_attention_1.1_999f13a34bd54af0_.log 2025-08-14T21:43:03.0718255Z Running 0 items in this shard: 2025-08-14T21:43:03.0718491Z 2025-08-14T21:43:03.0722594Z Running functorch/test_aot_joint_with_descriptors 1/1 ... [2025-08-14 21:43:03.071927] 2025-08-14T21:43:03.0723070Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:03.0726577Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aot_joint_with_descriptors.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:03.072306] 2025-08-14T21:43:07.2447284Z 2025-08-14T21:43:07.2448297Z functorch/test_aot_joint_with_descriptors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aot_joint_with_descriptors_1.1_3e7d1f7d3736d426_.log 2025-08-14T21:43:07.2449140Z Running 0 items in this shard: 2025-08-14T21:43:07.2449320Z 2025-08-14T21:43:07.2450803Z Running test_pruning_op 1/1 ... [2025-08-14 21:43:07.244798] 2025-08-14T21:43:07.2451179Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:07.2455228Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_pruning_op.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:07.245170] 2025-08-14T21:43:11.1167581Z 2025-08-14T21:43:11.1168398Z test_pruning_op 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_pruning_op_1.1_c3e5e9c9d8e5bb02_.log 2025-08-14T21:43:11.1169049Z Running 0 items in this shard: 2025-08-14T21:43:11.1169240Z 2025-08-14T21:43:11.1171217Z Running lazy/test_functionalization 1/1 ... [2025-08-14 21:43:11.116836] 2025-08-14T21:43:11.1171651Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:11.1175277Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_functionalization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:11.117193] 2025-08-14T21:43:14.9890427Z 2025-08-14T21:43:14.9891299Z lazy/test_functionalization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_functionalization_1.1_756de0a3be111ef5_.log 2025-08-14T21:43:14.9892057Z Running 0 items in this shard: 2025-08-14T21:43:14.9892237Z 2025-08-14T21:43:14.9893170Z Running functorch/test_ac 1/1 ... [2025-08-14 21:43:14.988980] 2025-08-14T21:43:14.9893651Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:14.9896622Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ac.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:14.989341] 2025-08-14T21:43:21.6167067Z 2025-08-14T21:43:21.6168376Z functorch/test_ac 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ac_1.1_a6afdf5eb339bcc1_.log 2025-08-14T21:43:21.6169405Z Running 0 items in this shard: 2025-08-14T21:43:21.6169701Z 2025-08-14T21:43:21.6171826Z Running test_ops_fwd_gradients 1/1 ... [2025-08-14 21:43:21.616857] 2025-08-14T21:43:21.6172396Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:21.6176495Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_fwd_gradients.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:21.617294] 2025-08-14T21:43:27.6430299Z 2025-08-14T21:43:27.6431551Z test_ops_fwd_gradients 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_fwd_gradients_1.1_d177a1f08e7c3620_.log 2025-08-14T21:43:27.6432273Z Running 0 items in this shard: 2025-08-14T21:43:27.6432456Z 2025-08-14T21:43:27.6433925Z Running test_hub 1/1 ... [2025-08-14 21:43:27.643106] 2025-08-14T21:43:27.6434410Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:27.6438216Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_hub.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:27.643446] 2025-08-14T21:43:31.6173014Z 2025-08-14T21:43:31.6173811Z test_hub 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_hub_1.1_c7d0f73da0df29bb_.log 2025-08-14T21:43:31.6174476Z Running 0 items in this shard: 2025-08-14T21:43:31.6174654Z 2025-08-14T21:43:31.6176583Z Running test_set_default_mobile_cpu_allocator 1/1 ... [2025-08-14 21:43:31.617328] 2025-08-14T21:43:31.6177196Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:31.6180669Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_set_default_mobile_cpu_allocator.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:31.617656] 2025-08-14T21:43:35.4400426Z 2025-08-14T21:43:35.4401703Z test_set_default_mobile_cpu_allocator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_set_default_mobile_cpu_allocator_1.1_b468067168c6b3e6_.log 2025-08-14T21:43:35.4403100Z Running 0 items in this shard: 2025-08-14T21:43:35.4403396Z 2025-08-14T21:43:35.4403717Z Running test_cuda_expandable_segments 1/1 ... [2025-08-14 21:43:35.440121] 2025-08-14T21:43:35.4404431Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:35.4409311Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cuda_expandable_segments.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:35.440569] 2025-08-14T21:43:49.9848245Z 2025-08-14T21:43:49.9849430Z test_cuda_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cuda_expandable_segments_1.1_dd97eb7e0b6f2299_.log 2025-08-14T21:43:49.9857784Z Running 22 items in this shard: test/test_cuda_expandable_segments.py::TestCuda::test_cuda_kernel_loop_overflow, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_kernel_loop_overflow_large, test/test_cuda_expandable_segments.py::TestCuda::test_get_per_process_memory_fraction, test/test_cuda_expandable_segments.py::TestCuda::test_graph_cudnn_dropout, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_parameterless_nograd_module_with_amp_cache_disabled_allow_unused_input, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_parameterless_nograd_module_with_amp_cache_enabled_allow_unused_input, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_parameterless_nograd_module_without_amp_allow_unused_input, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_parameterless_nograd_module_without_amp_not_allow_unused_input, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_with_amp_cache_disabled_allow_unused_input, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_with_amp_cache_enabled_allow_unused_input, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_without_amp_allow_unused_input, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_without_amp_not_allow_unused_input, test/test_cuda_expandable_segments.py::TestCuda::test_host_memory_stats, test/test_cuda_expandable_segments.py::TestCuda::test_huge_index, test/test_cuda_expandable_segments.py::TestCuda::test_max_large_axis, test/test_cuda_expandable_segments.py::TestCuda::test_out_of_memory_retry, test/test_cuda_expandable_segments.py::TestCuda::test_randint_generation_for_large_numel, test/test_cuda_expandable_segments.py::TestCuda::test_repeat_graph_capture_cublas_workspace_memory, test/test_cuda_expandable_segments.py::TestCuda::test_set_per_process_memory_fraction, test/test_cuda_expandable_segments.py::TestCuda::test_to_non_blocking, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_garbage_collect_expandable, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_max_split_expandable 2025-08-14T21:43:49.9866040Z 2025-08-14T21:43:49.9866219Z Running test_stateless 1/1 ... [2025-08-14 21:43:49.984908] 2025-08-14T21:43:49.9866585Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:49.9867538Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_stateless.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:49.985274] 2025-08-14T21:43:54.1575734Z 2025-08-14T21:43:54.1576935Z test_stateless 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_stateless_1.1_1485ac0af2e6e52c_.log 2025-08-14T21:43:54.1578477Z Running 0 items in this shard: 2025-08-14T21:43:54.1578768Z 2025-08-14T21:43:54.1581735Z Running test_numpy_interop 1/1 ... [2025-08-14 21:43:54.157877] 2025-08-14T21:43:54.1582283Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:54.1586044Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_numpy_interop.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:54.158256] 2025-08-14T21:43:58.5314021Z 2025-08-14T21:43:58.5314735Z test_numpy_interop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_numpy_interop_1.1_2b9cd577bcd7d515_.log 2025-08-14T21:43:58.5315395Z Running 0 items in this shard: 2025-08-14T21:43:58.5315573Z 2025-08-14T21:43:58.5320407Z Running profiler/test_record_function 1/1 ... [2025-08-14 21:43:58.531766] 2025-08-14T21:43:58.5320827Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:43:58.5324213Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_record_function.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:43:58.532103] 2025-08-14T21:44:02.3540720Z 2025-08-14T21:44:02.3541723Z profiler/test_record_function 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_record_function_1.1_8e6c6182bc0475e9_.log 2025-08-14T21:44:02.3542489Z Running 0 items in this shard: 2025-08-14T21:44:02.3542691Z 2025-08-14T21:44:02.3543368Z Running test_type_hints 1/1 ... [2025-08-14 21:44:02.354093] 2025-08-14T21:44:02.3543781Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:02.3547539Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_type_hints.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:02.354425] 2025-08-14T21:44:06.3261564Z 2025-08-14T21:44:06.3262626Z test_type_hints 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_type_hints_1.1_9c65717c87e2b157_.log 2025-08-14T21:44:06.3263284Z Running 0 items in this shard: 2025-08-14T21:44:06.3263473Z 2025-08-14T21:44:06.3265159Z Running nn/test_lazy_modules 1/1 ... [2025-08-14 21:44:06.326231] 2025-08-14T21:44:06.3265550Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:06.3269084Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_lazy_modules.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:06.326586] 2025-08-14T21:44:10.4499617Z 2025-08-14T21:44:10.4500398Z nn/test_lazy_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_lazy_modules_1.1_c1fc0e0d51b9b320_.log 2025-08-14T21:44:10.4501083Z Running 0 items in this shard: 2025-08-14T21:44:10.4501580Z 2025-08-14T21:44:10.4502987Z Running test_segment_reductions 1/1 ... [2025-08-14 21:44:10.450019] 2025-08-14T21:44:10.4503395Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:10.4506823Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_segment_reductions.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:10.450352] 2025-08-14T21:44:14.8227005Z 2025-08-14T21:44:14.8227936Z test_segment_reductions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_segment_reductions_1.1_1bccc52652b6229f_.log 2025-08-14T21:44:14.8228640Z Running 0 items in this shard: 2025-08-14T21:44:14.8228824Z 2025-08-14T21:44:14.8230602Z Running test_import_stats 1/1 ... [2025-08-14 21:44:14.822726] 2025-08-14T21:44:14.8231411Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:14.8233916Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_import_stats.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:14.823070] 2025-08-14T21:44:18.6448841Z 2025-08-14T21:44:18.6449808Z test_import_stats 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_import_stats_1.1_f0a6a41b5537e1c2_.log 2025-08-14T21:44:18.6450483Z Running 0 items in this shard: 2025-08-14T21:44:18.6450666Z 2025-08-14T21:44:18.6452001Z Running test_masked 1/1 ... [2025-08-14 21:44:18.644909] 2025-08-14T21:44:18.6452363Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:18.6455683Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_masked.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:18.645240] 2025-08-14T21:44:24.0214156Z 2025-08-14T21:44:24.0215018Z test_masked 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_masked_1.1_684ca6949ae8b806_.log 2025-08-14T21:44:24.0215755Z Running 0 items in this shard: 2025-08-14T21:44:24.0215935Z 2025-08-14T21:44:24.0216215Z Running test_cuda_sanitizer 1/1 ... [2025-08-14 21:44:24.021372] 2025-08-14T21:44:24.0216636Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:24.0220455Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cuda_sanitizer.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:24.021720] 2025-08-14T21:44:27.9934705Z 2025-08-14T21:44:27.9935966Z test_cuda_sanitizer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cuda_sanitizer_1.1_62790a170bba1501_.log 2025-08-14T21:44:27.9936775Z Running 0 items in this shard: 2025-08-14T21:44:27.9937014Z 2025-08-14T21:44:27.9937269Z Running test_monitor 1/1 ... [2025-08-14 21:44:27.993438] 2025-08-14T21:44:27.9937637Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:27.9940772Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_monitor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:27.993761] 2025-08-14T21:44:31.8155063Z 2025-08-14T21:44:31.8155834Z test_monitor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_monitor_1.1_65b571211c6349bc_.log 2025-08-14T21:44:31.8156460Z Running 0 items in this shard: 2025-08-14T21:44:31.8156636Z 2025-08-14T21:44:31.8158169Z Running profiler/test_cpp_thread 1/1 ... [2025-08-14 21:44:31.815499] 2025-08-14T21:44:31.8158650Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:31.8161569Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_cpp_thread.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:31.815825] 2025-08-14T21:44:37.3905463Z 2025-08-14T21:44:37.3906286Z profiler/test_cpp_thread 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_cpp_thread_1.1_dfcd5c6b4128e72b_.log 2025-08-14T21:44:37.3907003Z Running 0 items in this shard: 2025-08-14T21:44:37.3907187Z 2025-08-14T21:44:37.3908022Z Running test_subclass 1/1 ... [2025-08-14 21:44:37.390525] 2025-08-14T21:44:37.3908949Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:37.3912611Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_subclass.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:37.390864] 2025-08-14T21:44:41.3629479Z 2025-08-14T21:44:41.3630509Z test_subclass 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_subclass_1.1_06cbb05cd8ef5ca9_.log 2025-08-14T21:44:41.3631152Z Running 0 items in this shard: 2025-08-14T21:44:41.3631335Z 2025-08-14T21:44:41.3633081Z Running profiler/test_profiler 1/1 ... [2025-08-14 21:44:41.363002] 2025-08-14T21:44:41.3633478Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:41.3636511Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_profiler.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:41.363343] 2025-08-14T21:44:45.6359916Z 2025-08-14T21:44:45.6360968Z profiler/test_profiler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_profiler_1.1_2fece71adee404bc_.log 2025-08-14T21:44:45.6366040Z Running 10 items in this shard: test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_basic_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_basic_work_in_main_thread_True, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_close_in_scope_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_close_in_scope_work_in_main_thread_True, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_complex_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_complex_work_in_main_thread_True, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_multiple_preexisting_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_multiple_preexisting_work_in_main_thread_True, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_open_in_scope_work_in_main_thread_False, test/profiler/test_profiler.py::TestProfiler::test_source_multithreaded_open_in_scope_work_in_main_thread_True 2025-08-14T21:44:45.6370232Z 2025-08-14T21:44:45.6370470Z Running functorch/test_vmap_registrations 1/1 ... [2025-08-14 21:44:45.636073] 2025-08-14T21:44:45.6370903Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:45.6371963Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_vmap_registrations.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:45.636470] 2025-08-14T21:44:49.8087972Z 2025-08-14T21:44:49.8088947Z functorch/test_vmap_registrations 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_vmap_registrations_1.1_80a022d446ccf29d_.log 2025-08-14T21:44:49.8089759Z Running 0 items in this shard: 2025-08-14T21:44:49.8089953Z 2025-08-14T21:44:49.8091873Z Running nn/test_parametrization 1/1 ... [2025-08-14 21:44:49.808901] 2025-08-14T21:44:49.8092280Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:49.8095865Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_parametrization.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:49.809269] 2025-08-14T21:44:54.1814573Z 2025-08-14T21:44:54.1815628Z nn/test_parametrization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_parametrization_1.1_84bd565bf937ddee_.log 2025-08-14T21:44:54.1816350Z Running 0 items in this shard: 2025-08-14T21:44:54.1816541Z 2025-08-14T21:44:54.1820132Z Running nn/test_pruning 1/1 ... [2025-08-14 21:44:54.181665] 2025-08-14T21:44:54.1820516Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:54.1824689Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_pruning.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:54.182054] 2025-08-14T21:44:58.3540604Z 2025-08-14T21:44:58.3541213Z nn/test_pruning 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_pruning_1.1_3201999c0c2b5294_.log 2025-08-14T21:44:58.3541848Z Running 0 items in this shard: 2025-08-14T21:44:58.3544918Z 2025-08-14T21:44:58.3545112Z Running functorch/test_dims 1/1 ... [2025-08-14 21:44:58.354167] 2025-08-14T21:44:58.3545507Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:44:58.3548898Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_dims.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:44:58.354552] 2025-08-14T21:45:02.4762097Z 2025-08-14T21:45:02.4763014Z functorch/test_dims 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_dims_1.1_f4585f7fb9c7e885_.log 2025-08-14T21:45:02.4763689Z Running 0 items in this shard: 2025-08-14T21:45:02.4763866Z 2025-08-14T21:45:02.4765964Z Running test_itt 1/1 ... [2025-08-14 21:45:02.476308] 2025-08-14T21:45:02.4766313Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:02.4769823Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_itt.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:02.476672] 2025-08-14T21:45:06.2982212Z 2025-08-14T21:45:06.2983227Z test_itt 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_itt_1.1_f246171f6883ebb7_.log 2025-08-14T21:45:06.2983876Z Running 0 items in this shard: 2025-08-14T21:45:06.2985135Z 2025-08-14T21:45:06.2985505Z Running torch_np/numpy_tests/core/test_multiarray 1/2 ... [2025-08-14 21:45:06.298127] 2025-08-14T21:45:06.2986167Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:06.2988978Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_multiarray.py', '-m', 'serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:06.298540] 2025-08-14T21:45:10.4702246Z 2025-08-14T21:45:10.4703423Z torch_np/numpy_tests/core/test_multiarray 1/2 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_multiarray_1.2_2e05f2b6d13b8a81_.log 2025-08-14T21:45:10.4704334Z Running 0 items in this shard: 2025-08-14T21:45:10.4704551Z 2025-08-14T21:45:10.4706598Z Running functorch/test_aotdispatch 1/1 ... [2025-08-14 21:45:10.470378] 2025-08-14T21:45:10.4707207Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:10.4710688Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aotdispatch.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:10.470734] 2025-08-14T21:45:16.7459766Z 2025-08-14T21:45:16.7460724Z functorch/test_aotdispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aotdispatch_1.1_5cea7e9107e60ff5_.log 2025-08-14T21:45:16.7461473Z Running 0 items in this shard: 2025-08-14T21:45:16.7461654Z 2025-08-14T21:45:16.7463377Z Running test_schema_check 1/1 ... [2025-08-14 21:45:16.746069] 2025-08-14T21:45:16.7465951Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:16.7467550Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_schema_check.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:16.746407] 2025-08-14T21:45:23.3725714Z 2025-08-14T21:45:23.3726487Z test_schema_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_schema_check_1.1_a31b39d6f01ff593_.log 2025-08-14T21:45:23.3727158Z Running 0 items in this shard: 2025-08-14T21:45:23.3727341Z 2025-08-14T21:45:23.3729474Z Running test_datapipe 1/1 ... [2025-08-14 21:45:23.372690] 2025-08-14T21:45:23.3729843Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:23.3733194Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_datapipe.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:23.373036] 2025-08-14T21:45:27.3947876Z 2025-08-14T21:45:27.3948616Z test_datapipe 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_datapipe_1.1_400e526d20818677_.log 2025-08-14T21:45:27.3949287Z Running 0 items in this shard: 2025-08-14T21:45:27.3949470Z 2025-08-14T21:45:27.3951675Z Running test_bundled_inputs 1/1 ... [2025-08-14 21:45:27.394859] 2025-08-14T21:45:27.3952060Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:27.3955666Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_bundled_inputs.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:27.395244] 2025-08-14T21:45:31.2667886Z 2025-08-14T21:45:31.2668693Z test_bundled_inputs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_bundled_inputs_1.1_06597f0886aaeaf0_.log 2025-08-14T21:45:31.2669609Z Running 0 items in this shard: 2025-08-14T21:45:31.2669820Z 2025-08-14T21:45:31.2671669Z Running profiler/test_execution_trace 1/1 ... [2025-08-14 21:45:31.266882] 2025-08-14T21:45:31.2672109Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:31.2682964Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_execution_trace.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:31.267224] 2025-08-14T21:45:35.6394644Z 2025-08-14T21:45:35.6396672Z profiler/test_execution_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_execution_trace_1.1_9a7e7c6ddf7adba9_.log 2025-08-14T21:45:35.6398121Z Running 0 items in this shard: 2025-08-14T21:45:35.6398314Z 2025-08-14T21:45:35.6398508Z Running lazy/test_generator 1/1 ... [2025-08-14 21:45:35.639488] 2025-08-14T21:45:35.6398883Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:35.6401711Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_generator.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:35.639854] 2025-08-14T21:45:39.5111584Z 2025-08-14T21:45:39.5112379Z lazy/test_generator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_generator_1.1_a9c94ba676cba13c_.log 2025-08-14T21:45:39.5113056Z Running 0 items in this shard: 2025-08-14T21:45:39.5113249Z 2025-08-14T21:45:39.5115667Z Running torch_np/numpy_tests/core/test_indexing 1/1 ... [2025-08-14 21:45:39.511236] 2025-08-14T21:45:39.5116215Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:39.5119700Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_indexing.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:39.511621] 2025-08-14T21:45:43.4333133Z 2025-08-14T21:45:43.4334181Z torch_np/numpy_tests/core/test_indexing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_indexing_1.1_a00be2fe555d7ac0_.log 2025-08-14T21:45:43.4335056Z Running 0 items in this shard: 2025-08-14T21:45:43.4335245Z 2025-08-14T21:45:43.4336806Z Running test_maskedtensor 1/1 ... [2025-08-14 21:45:43.433360] 2025-08-14T21:45:43.4337197Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:43.4340541Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_maskedtensor.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:43.433724] 2025-08-14T21:45:48.9579947Z 2025-08-14T21:45:48.9580668Z test_maskedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_maskedtensor_1.1_5ab0512779c9742d_.log 2025-08-14T21:45:48.9581467Z Running 0 items in this shard: 2025-08-14T21:45:48.9581718Z 2025-08-14T21:45:48.9583143Z Running test_tensorboard 1/1 ... [2025-08-14 21:45:48.958009] 2025-08-14T21:45:48.9583534Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:48.9586897Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_tensorboard.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:48.958375] 2025-08-14T21:45:53.2805201Z 2025-08-14T21:45:53.2806251Z test_tensorboard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_tensorboard_1.1_5a3038c31f43a006_.log 2025-08-14T21:45:53.2806900Z Running 0 items in this shard: 2025-08-14T21:45:53.2807091Z 2025-08-14T21:45:53.2809047Z Running torch_np/numpy_tests/lib/test_type_check 1/1 ... [2025-08-14 21:45:53.280597] 2025-08-14T21:45:53.2809679Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:53.2813157Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_type_check.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:53.280958] 2025-08-14T21:45:57.1521169Z 2025-08-14T21:45:57.1522324Z torch_np/numpy_tests/lib/test_type_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_type_check_1.1_6ae81ac222450123_.log 2025-08-14T21:45:57.1523160Z Running 0 items in this shard: 2025-08-14T21:45:57.1523347Z 2025-08-14T21:45:57.1525437Z Running lazy/test_ts_opinfo 1/1 ... [2025-08-14 21:45:57.152201] 2025-08-14T21:45:57.1525971Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:45:57.1530098Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_ts_opinfo.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:45:57.152584] 2025-08-14T21:46:02.3264365Z 2025-08-14T21:46:02.3265114Z lazy/test_ts_opinfo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_ts_opinfo_1.1_97dc335a0cbefc26_.log 2025-08-14T21:46:02.3265901Z Running 0 items in this shard: 2025-08-14T21:46:02.3266104Z 2025-08-14T21:46:02.3268167Z Running test_mkldnn_fusion 1/1 ... [2025-08-14 21:46:02.326554] 2025-08-14T21:46:02.3268559Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:46:02.3272103Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkldnn_fusion.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:46:02.326897] 2025-08-14T21:46:06.2983784Z 2025-08-14T21:46:06.2984804Z test_mkldnn_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkldnn_fusion_1.1_afce9695227f85d8_.log 2025-08-14T21:46:06.2985514Z Running 0 items in this shard: 2025-08-14T21:46:06.2985697Z 2025-08-14T21:46:06.2987670Z Running benchmark_utils/test_benchmark_utils 1/1 ... [2025-08-14 21:46:06.298480] 2025-08-14T21:46:06.2988127Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:46:06.2991565Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'benchmark_utils/test_benchmark_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:46:06.298839] 2025-08-14T21:46:10.1706423Z 2025-08-14T21:46:10.1707435Z benchmark_utils/test_benchmark_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/benchmark_utils.test_benchmark_utils_1.1_8c41a580c2bf2712_.log 2025-08-14T21:46:10.1708375Z Running 0 items in this shard: 2025-08-14T21:46:10.1708610Z 2025-08-14T21:46:10.1710391Z Running optim/test_swa_utils 1/1 ... [2025-08-14 21:46:10.170669] 2025-08-14T21:46:10.1710794Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:46:10.1713493Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'optim/test_swa_utils.py', '-m', 'serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:46:10.171018] 2025-08-14T21:46:13.9421600Z 2025-08-14T21:46:13.9422518Z optim/test_swa_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/optim.test_swa_utils_1.1_27c063586cf40236_.log 2025-08-14T21:46:13.9423109Z 2025-08-14T21:46:13.9425239Z Running test_sparse_csr 2/2 ... [2025-08-14 21:46:13.942240] 2025-08-14T21:46:13.9425620Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:46:13.9429026Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_csr.py', '-m', 'serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:46:13.942580] 2025-08-14T21:46:20.5692384Z 2025-08-14T21:46:20.5693138Z test_sparse_csr 2/2 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_csr_2.2_70d5fc485050275a_.log 2025-08-14T21:46:20.5694850Z Running 0 items in this shard: 2025-08-14T21:46:20.5695247Z 2025-08-14T21:46:23.8784358Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:46:23.8788120Z import pkg_resources 2025-08-14T21:46:23.8832069Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:46:23.8833716Z import pkg_resources 2025-08-14T21:46:23.8834936Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:46:23.8836184Z import pkg_resources 2025-08-14T21:46:24.2542322Z Running export/test_torchbind 1/1 ... [2025-08-14 21:46:24.253688] 2025-08-14T21:46:24.2542803Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:46:24.2545717Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_torchbind.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:46:24.254070] 2025-08-14T21:46:24.2683839Z Running test_testing 1/1 ... [2025-08-14 21:46:24.267950] 2025-08-14T21:46:24.2684242Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:46:24.2684701Z Running inductor/test_aot_inductor 1/1 ... [2025-08-14 21:46:24.268020] 2025-08-14T21:46:24.2685174Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:46:24.2686967Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_testing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:46:24.268345] 2025-08-14T21:46:24.2688751Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:46:24.268397] 2025-08-14T21:46:43.0708304Z 2025-08-14T21:46:43.0709289Z test_testing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_testing_1.1_a4c760f0195db170_.log 2025-08-14T21:46:43.1512525Z Running 2072 items in this shard: test/test_testing.py::TestTestingCUDA::test_assertEqual_longMessage_cuda, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_bool, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_complex128, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_complex64, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_float16, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_float32, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_float64, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_int16, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_int32, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_int64, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_int8, test/test_testing.py::TestTestingCUDA::test_assertEqual_numpy_cuda_uint8, test/test_testing.py::TestTestingCUDA::test_cuda_assert_should_not_stop_common_distributed_test_suite_cuda, test/test_testing.py::TestTestingCUDA::test_cuda_assert_should_stop_common_device_type_test_suite_cuda, test/test_testing.py::TestTestingCUDA::test_cuda_assert_should_stop_common_utils_test_suite_cuda, test/test_testing.py::TestTestingCUDA::test_get_supported_dtypes_cuda, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_bool, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_float16, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_float32, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_float64, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_int16, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_int32, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_int64, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_int8, test/test_testing.py::TestTestingCUDA::test_isclose_atol_rtol_greater_than_zero_cuda_uint8, test/test_testing.py::TestTestingCUDA::test_isclose_bool_cuda, test/test_testing.py::TestTestingCUDA::test_isclose_complex_cuda_complex128, test/test_testing.py::TestTestingCUDA::test_isclose_complex_cuda_complex64, test/test_testing.py::TestTestingCUDA::test_isclose_equality_shortcut_cuda, test/test_testing.py::TestTestingCUDA::test_isclose_float_cuda_float16, test/test_testing.py::TestTestingCUDA::test_isclose_float_cuda_float32, test/test_testing.py::TestTestingCUDA::test_isclose_float_cuda_float64, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_int16, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_int32, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_int64, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_int8, test/test_testing.py::TestTestingCUDA::test_isclose_integer_cuda_uint8, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_complex128, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_complex64, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_float16, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_float32, test/test_testing.py::TestTestingCUDA::test_isclose_nan_equality_shortcut_cuda_float64, test/test_testing.py::TestTestingCUDA::test_setup_and_teardown_run_for_device_specific_tests_cuda, test/test_testing.py::TestTestingCUDA::test_supported_dtypes_abs_cuda, test/test_testing.py::TestFrameworkUtils::test_filtering_env_var, test/test_testing.py::TestAssertClose::test_bool, test/test_testing.py::TestAssertClose::test_default_tolerance_selection_mismatching_dtypes, test/test_testing.py::TestAssertClose::test_docstring_examples, test/test_testing.py::TestAssertClose::test_matching, test/test_testing.py::TestAssertClose::test_matching_atol, test/test_testing.py::TestAssertClose::test_matching_conjugate_bit, test/test_testing.py::TestAssertClose::test_matching_nan, test/test_testing.py::TestAssertClose::test_matching_nan_with_equal_nan, test/test_testing.py::TestAssertClose::test_matching_rtol, test/test_testing.py::TestAssertClose::test_meta, test/test_testing.py::TestAssertClose::test_mismatching_dtype, test/test_testing.py::TestAssertClose::test_mismatching_dtype_no_check, test/test_testing.py::TestAssertClose::test_mismatching_layout, test/test_testing.py::TestAssertClose::test_mismatching_layout_no_check, test/test_testing.py::TestAssertClose::test_mismatching_shape, test/test_testing.py::TestAssertClose::test_mismatching_stride, test/test_testing.py::TestAssertClose::test_mismatching_stride_no_check, test/test_testing.py::TestAssertClose::test_mismatching_types, test/test_testing.py::TestAssertClose::test_mismatching_types_subclasses, test/test_testing.py::TestAssertClose::test_mismatching_types_type_equality, test/test_testing.py::TestAssertClose::test_mismatching_values, test/test_testing.py::TestAssertClose::test_mismatching_values_atol, test/test_testing.py::TestAssertClose::test_mismatching_values_rtol, test/test_testing.py::TestAssertClose::test_none, test/test_testing.py::TestAssertClose::test_none_mismatch, test/test_testing.py::TestAssertClose::test_numpy, test/test_testing.py::TestAssertClose::test_only_atol, test/test_testing.py::TestAssertClose::test_only_rtol, test/test_testing.py::TestAssertClose::test_scalar, test/test_testing.py::TestAssertClose::test_unexpected_error_compare, test/test_testing.py::TestAssertClose::test_unexpected_error_originate, test/test_testing.py::TestAssertClose::test_unknown_layout, test/test_testing.py::TestAssertClose::test_unknown_type, test/test_testing.py::TestAssertCloseMultiDeviceCUDA::test_mismatching_device_cuda, test/test_testing.py::TestAssertCloseMultiDeviceCUDA::test_mismatching_device_no_check_cuda, test/test_testing.py::TestAssertCloseErrorMessage::test_abs_diff, test/test_testing.py::TestAssertCloseErrorMessage::test_abs_diff_scalar, test/test_testing.py::TestAssertCloseErrorMessage::test_atol, test/test_testing.py::TestAssertCloseErrorMessage::test_identifier_scalars, test/test_testing.py::TestAssertCloseErrorMessage::test_identifier_tensor_likes, test/test_testing.py::TestAssertCloseErrorMessage::test_mismatched_elements, test/test_testing.py::TestAssertCloseErrorMessage::test_msg_callable, test/test_testing.py::TestAssertCloseErrorMessage::test_msg_str, test/test_testing.py::TestAssertCloseErrorMessage::test_not_close, test/test_testing.py::TestAssertCloseErrorMessage::test_not_equal, test/test_testing.py::TestAssertCloseErrorMessage::test_rel_diff, test/test_testing.py::TestAssertCloseErrorMessage::test_rel_diff_scalar, test/test_testing.py::TestAssertCloseErrorMessage::test_rtol, test/test_testing.py::TestAssertCloseErrorMessage::test_small_float_dtype, test/test_testing.py::TestAssertCloseErrorMessage::test_zero_div_zero, test/test_testing.py::TestAssertCloseContainer::test_mapping_mismatching_keys, test/test_testing.py::TestAssertCloseContainer::test_mapping_mismatching_values_msg, test/test_testing.py::TestAssertCloseContainer::test_sequence_mismatching_len, test/test_testing.py::TestAssertCloseContainer::test_sequence_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseCOO::test_matching_coalesced, test/test_testing.py::TestAssertCloseSparseCOO::test_matching_uncoalesced, test/test_testing.py::TestAssertCloseSparseCOO::test_mismatching_indices_msg, test/test_testing.py::TestAssertCloseSparseCOO::test_mismatching_nnz, test/test_testing.py::TestAssertCloseSparseCOO::test_mismatching_sparse_dims, test/test_testing.py::TestAssertCloseSparseCOO::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseCSR::test_matching, test/test_testing.py::TestAssertCloseSparseCSR::test_mismatching_col_indices_msg, test/test_testing.py::TestAssertCloseSparseCSR::test_mismatching_crow_indices_msg, test/test_testing.py::TestAssertCloseSparseCSR::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseCSC::test_matching, test/test_testing.py::TestAssertCloseSparseCSC::test_mismatching_ccol_indices_msg, test/test_testing.py::TestAssertCloseSparseCSC::test_mismatching_row_indices_msg, test/test_testing.py::TestAssertCloseSparseCSC::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseBSR::test_matching, test/test_testing.py::TestAssertCloseSparseBSR::test_mismatching_col_indices_msg, test/test_testing.py::TestAssertCloseSparseBSR::test_mismatching_crow_indices_msg, test/test_testing.py::TestAssertCloseSparseBSR::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseSparseBSC::test_matching, test/test_testing.py::TestAssertCloseSparseBSC::test_mismatching_ccol_indices_msg, test/test_testing.py::TestAssertCloseSparseBSC::test_mismatching_row_indices_msg, test/test_testing.py::TestAssertCloseSparseBSC::test_mismatching_values_msg, test/test_testing.py::TestAssertCloseQuantized::test_matching_per_channel, test/test_testing.py::TestAssertCloseQuantized::test_matching_per_tensor, test/test_testing.py::TestAssertCloseQuantized::test_mismatching_is_quantized, test/test_testing.py::TestAssertCloseQuantized::test_mismatching_qscheme, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_exclude_zero_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high0_value_types3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high1_value_types3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_ge_high_low_high2_value_types3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_boolean_integral2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_default_smoke_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_nan_low_high2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_outside_valid_range_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_low_high_smoke_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_memory_format_memory_format_and_shape4_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_memory_format_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape4_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape5_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_False_shape6_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape0_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape1_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape2_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape3_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape4_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape5_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_noncontiguous_noncontiguous_True_shape6_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_requires_grad_requires_grad_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape0_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape1_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape2_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape3_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape4_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape5_splat_shape_True_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_False_cuda_uint8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_bfloat16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_bool, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_complex128, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_complex32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_complex64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_float16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_float32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_float64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_int16, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_int32, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_int64, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_int8, test/test_testing.py::TestMakeTensorCUDA::test_smoke_shape6_splat_shape_True_cuda_uint8, test/test_testing.py::TestTestParametrization::test_apply_param_specific_decorators, test/test_testing.py::TestTestParametrization::test_compose_param_specific_decorators, test/test_testing.py::TestTestParametrization::test_default_names, test/test_testing.py::TestTestParametrization::test_modules_decorator_misuse_error, test/test_testing.py::TestTestParametrization::test_multiple_handling_of_same_param_error, test/test_testing.py::TestTestParametrization::test_name_fn, test/test_testing.py::TestTestParametrization::test_ops_decorator_misuse_error, test/test_testing.py::TestTestParametrization::test_reparametrize, test/test_testing.py::TestTestParametrization::test_subtest_expected_failure_x_1, test/test_testing.py::TestTestParametrization::test_subtest_expected_failure_x_2, test/test_testing.py::TestTestParametrization::test_subtest_expected_failure_x_3, test/test_testing.py::TestTestParametrization::test_subtest_names, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_1_y_4, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_1_y_5, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_1_y_6, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_2_y_4, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_2_y_5, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_2_y_6, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_3_y_4, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_3_y_5, test/test_testing.py::TestTestParametrization::test_two_things_subtest_expected_failure_x_3_y_6, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_default_name_non_primitive_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_default_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_dtypes_composition_invalid_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_dtypes_composition_valid_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_empty_param_list_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_empty_param_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_modules_composition_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_modules_decorator_applies_module_and_param_specific_decorators_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_multiple_handling_of_same_param_error_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_name_fn_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_ops_composition_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_ops_decorator_applies_op_and_param_specific_decorators_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_param_specific_decoration_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_subtest_expected_failure_x_1_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_subtest_expected_failure_x_2_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_subtest_expected_failure_x_3_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_subtest_names_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_1_y_4_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_1_y_5_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_1_y_6_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_2_y_4_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_2_y_5_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_2_y_6_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_3_y_4_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_3_y_5_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_two_things_subtest_expected_failure_x_3_y_6_cuda, test/test_testing.py::TestTestParametrizationDeviceTypeCUDA::test_unparametrized_names_cuda, test/test_testing.py::TestImports::test_circular_dependencies, test/test_testing.py::TestImports::test_lazy_imports_are_lazy, test/test_testing.py::TestImports::test_no_mutate_global_logging_on_import_path_functorch, test/test_testing.py::TestImports::test_no_mutate_global_logging_on_import_path_torch, test/test_testing.py::TestImports::test_no_warning_on_import, test/test_testing.py::TestImports::test_not_import_sympy, test/test_testing.py::TestOpInfos::test_sample_input, test/test_testing.py::TestOpInfos::test_sample_input_metadata, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_T_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___radd___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rand___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rdiv___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rmod___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rmul___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___ror___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rpow___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rsub___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators___rxor___cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators__chunk_cat_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_add_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_amax_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_amin_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_aminmax_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_arange_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_as_strided_scatter_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_atan2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bernoulli_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_and_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_left_shift_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_or_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_right_shift_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bitwise_xor_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_bucketize_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_cat_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_cauchy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_clamp_max_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_clamp_min_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_complex_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_copysign_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_cov_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diag_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diag_embed_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diagonal_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diagonal_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_diff_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_div_floor_rounding_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_div_no_rounding_mode_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_div_trunc_rounding_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_dot_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_dsplit_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_dstack_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_empty_permuted_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_eq_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_exponential_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_eye_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_fft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_fft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_fftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_hfft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_hfft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_hfftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ifft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ifft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ifftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ihfft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ihfft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_ihfftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_irfft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_irfft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_irfftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_rfft2_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_rfft_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fft_rfftn_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fliplr_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_flipud_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_float_power_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_floor_divide_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fmax_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fmin_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_fmod_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_gather_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_gcd_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_ge_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_geometric_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_gradient_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_gt_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_heaviside_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_histogramdd_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_hsplit_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_hstack_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_hypot_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_igamma_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_igammac_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_index_add_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_index_select_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_isclose_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_item_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_jiterator_binary_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_jiterator_binary_return_by_ref_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_kthvalue_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_lcm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_ldexp_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_le_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linalg_cross_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linalg_diagonal_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linalg_lstsq_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linalg_lstsq_grad_oriented_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linspace_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_linspace_tensor_overload_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_log_normal_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logaddexp_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logcumsumexp_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logical_and_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logical_or_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logical_xor_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logspace_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_logspace_tensor_overload_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_lt_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_masked_fill_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_masked_scatter_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_masked_select_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_max_binary_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_maximum_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_mean_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_median_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_min_binary_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_minimum_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_movedim_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_mul_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_multinomial_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_narrow_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_narrow_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_native_layer_norm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_ne_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_neg_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nextafter_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_avg_pool1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_avg_pool2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_avg_pool3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_max_pool1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_max_pool2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_adaptive_max_pool3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_avg_pool1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_avg_pool2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_avg_pool3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_conv1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_conv2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_conv3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_embedding_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_gaussian_nll_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_gelu_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_group_norm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_hardtanh_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_hinge_embedding_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_huber_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_l1_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_margin_ranking_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_max_pool1d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_max_pool2d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_max_pool3d_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_multi_margin_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_multilabel_margin_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_poisson_nll_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_prelu_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_rms_norm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_rrelu_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_soft_margin_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_softshrink_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_triplet_margin_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_nn_functional_triplet_margin_with_distance_loss_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_normal_in_place_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_ormqr_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_polar_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_pow_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_remainder_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_renorm_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_reshape_as_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_reshape_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_roll_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_rot90_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_rsub_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_scatter_add_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_scatter_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_bartlett_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_blackman_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_cosine_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_exponential_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_gaussian_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_general_cosine_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_general_hamming_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_hamming_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_hann_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_kaiser_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_signal_windows_nuttall_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_chebyshev_polynomial_t_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_chebyshev_polynomial_u_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_chebyshev_polynomial_v_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_chebyshev_polynomial_w_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_hermite_polynomial_h_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_hermite_polynomial_he_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_laguerre_polynomial_l_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_legendre_polynomial_p_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_shifted_chebyshev_polynomial_t_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_shifted_chebyshev_polynomial_u_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_shifted_chebyshev_polynomial_v_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_shifted_chebyshev_polynomial_w_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_xlog1py_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_special_zeta_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_sub_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_sum_to_size_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_t_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_t_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_take_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_trace_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_tril_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_triu_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_true_divide_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_unbind_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_unbind_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_uniform_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_vdot_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_view_as_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_view_copy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_view_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_vsplit_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_vstack_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_where_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_error_generators_xlogy_cuda, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___radd___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rand___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rdiv___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rmod___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rmul___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___ror___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rpow___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rsub___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators___rxor___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_abs_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_acos_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_acosh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_addcdiv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_addcmul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_angle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_asin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_asinh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_atan2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_atan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_atanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bfloat16_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_and_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_left_shift_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_not_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_or_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_right_shift_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bitwise_xor_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bool_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_broadcast_tensors_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_bucketize_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_byte_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cdouble_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_ceil_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cfloat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_chalf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_char_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_chunk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_clamp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_clamp_max_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_clamp_min_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_clone_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_complex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_conj_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_conj_physical_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_contiguous_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_copysign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cos_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_cosh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_deg2rad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_diag_embed_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_diagonal_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_diagonal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_digamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_div_floor_rounding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_div_no_rounding_mode_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_div_trunc_rounding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_double_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_empty_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_eq_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_erf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_erfc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_erfinv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_exp2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_exp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_expm1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_flatten_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_float_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_float_power_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_floor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_floor_divide_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_fmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_fmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_fmod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_frac_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_frexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_gcd_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_ge_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_gt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_half_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_heaviside_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_hypot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_i0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_igamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_igammac_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_imag_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_index_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_index_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_index_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_index_select_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_int_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isclose_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isfinite_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isinf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isnan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isneginf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isposinf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_isreal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_jiterator_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_jiterator_binary_return_by_ref_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_jiterator_unary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_lcm_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_ldexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_le_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_lgamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_log10_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_log1p_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_log2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_log_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logaddexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logical_and_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logical_not_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logical_or_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logical_xor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_logsumexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_long_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_lt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_max_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_maximum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_min_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_minimum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_movedim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_mul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nan_to_num_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_narrow_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_narrow_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_ne_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_neg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nextafter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_celu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_elu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_grid_sample_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_group_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_hardshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_hardsigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_hardtanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_hinge_embedding_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_interpolate_bicubic_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_interpolate_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_logsigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_margin_ranking_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_mish_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_multi_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_multilabel_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_prelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_relu6_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_relu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_rrelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_selu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_silu_complex_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_silu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_softplus_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_softshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_softsign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_tanhshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_threshold_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_nn_functional_upsample_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_permute_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_permute_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polar_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_polygamma_polygamma_n_4_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_positive_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_pow_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_rad2deg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_real_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_reciprocal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_remainder_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_reshape_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_reshape_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_round_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_round_decimals_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_round_decimals_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_round_decimals_neg_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_rsqrt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_rsub_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sgn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_short_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_bartlett_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_blackman_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_cosine_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_exponential_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_gaussian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_general_cosine_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_general_hamming_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_hamming_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_hann_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_kaiser_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signal_windows_nuttall_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_signbit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sinc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sinh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_airy_ai_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_bessel_j0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_bessel_j1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_bessel_y0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_bessel_y1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_chebyshev_polynomial_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_chebyshev_polynomial_u_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_chebyshev_polynomial_v_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_chebyshev_polynomial_w_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_entr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_erfcx_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_hermite_polynomial_h_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_hermite_polynomial_he_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_i0e_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_i1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_i1e_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_laguerre_polynomial_l_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_legendre_polynomial_p_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_log_ndtr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_modified_bessel_i0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_modified_bessel_i1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_modified_bessel_k0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_modified_bessel_k1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_ndtr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_ndtri_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_scaled_modified_bessel_k0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_scaled_modified_bessel_k1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_spherical_bessel_j0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_xlog1py_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_special_zeta_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sqrt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_square_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_sub_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_tan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_tanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_true_divide_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_trunc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_unsafe_chunk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_view_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_view_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_where_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_reference_generators_xlogy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_H_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_T_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___getitem___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___radd___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rand___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rdiv___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rmatmul___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rmod___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rmul___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___ror___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rpow___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rsub___cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators___rxor___cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__batch_norm_with_update_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__chunk_cat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__native_batch_norm_legit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__segment_reduce_lengths_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__segment_reduce_offsets_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__softmax_backward_data_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__unsafe_masked_index_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__unsafe_masked_index_put_accumulate_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators__upsample_bilinear2d_aa_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_abs_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_acos_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_acosh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addbmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addcdiv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addcmul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addmm_decomposed_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addmv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_addr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_alias_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_all_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_allclose_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_amax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_amin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_aminmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_angle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_any_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_arange_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_argmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_argmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_argsort_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_argwhere_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_as_strided_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_as_strided_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_as_strided_partial_views_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_as_strided_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_asin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_asinh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atan2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atleast_1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atleast_2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_atleast_3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_baddbmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bernoulli_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bfloat16_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bincount_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_and_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_left_shift_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_not_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_or_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_right_shift_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bitwise_xor_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_block_diag_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bool_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_broadcast_shapes_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_broadcast_tensors_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_broadcast_to_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_bucketize_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_byte_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cartesian_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cauchy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cdist_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cdouble_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ceil_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cfloat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_chalf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_char_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cholesky_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cholesky_inverse_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cholesky_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_chunk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_clamp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_clamp_max_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_clamp_min_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_clone_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_column_stack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_combinations_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_complex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_conj_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_conj_physical_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_constant_pad_nd_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_contiguous_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_copysign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_corrcoef_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cos_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cosh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_count_nonzero_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cov_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cross_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cummax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cummin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cumprod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cumsum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_cumulative_trapezoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_deg2rad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diag_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diag_embed_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diagflat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diagonal_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diagonal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diagonal_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_diff_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_digamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_dist_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_div_floor_rounding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_div_no_rounding_mode_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_div_trunc_rounding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_dot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_double_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_dsplit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_dstack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_einsum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_empty_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_empty_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_empty_permuted_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_empty_strided_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_eq_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_equal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_erf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_erfc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_erfinv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_exp2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_exp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_expand_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_expand_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_expand_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_expm1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_exponential_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_eye_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_fft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_fft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_fftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_fftshift_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_hfft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_hfft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_hfftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ifft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ifft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ifftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ifftshift_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ihfft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ihfft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_ihfftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_irfft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_irfft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_irfftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_rfft2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_rfft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fft_rfftn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_flatten_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_flip_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fliplr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_flipud_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_float_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_float_power_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_floor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_floor_divide_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_fmod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_frac_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_frexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_full_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_full_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_gather_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_gcd_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ge_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_geometric_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_geqrf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_gradient_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_grid_sampler_2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_gt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_half_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_hash_tensor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_heaviside_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_histc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_hsplit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_hstack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_hypot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_i0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_igamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_igammac_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_imag_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_put_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_reduce_amax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_reduce_amin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_reduce_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_reduce_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_index_select_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_inner_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_int_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isclose_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isfinite_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isinf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isnan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isneginf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isposinf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_isreal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_istft_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_item_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_2inputs_2outputs_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_4inputs_with_extra_args_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_binary_return_by_ref_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_jiterator_unary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_kron_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_kthvalue_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lcm_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ldexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_le_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lerp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lgamma_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_cholesky_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_cholesky_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_cond_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_cross_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_det_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_diagonal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_eig_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_eigh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_eigvals_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_eigvalsh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_householder_product_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_inv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_inv_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_ldl_factor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_ldl_factor_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_ldl_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lstsq_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lstsq_grad_oriented_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lu_factor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lu_factor_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_lu_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_matrix_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_matrix_power_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_matrix_rank_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_matrix_rank_hermitian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_multi_dot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_norm_subgradients_at_zero_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_pinv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_pinv_hermitian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_pinv_singular_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_qr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_slogdet_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_solve_ex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_solve_triangular_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_svd_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_svdvals_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_tensorinv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_tensorsolve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_vander_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_vecdot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linalg_vector_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linspace_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_linspace_tensor_overload_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log10_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log1p_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log_normal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log_softmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_log_softmax_with_dtype_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logaddexp2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logaddexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logcumsumexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logdet_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logical_and_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logical_not_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logical_or_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logical_xor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logspace_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logspace_tensor_overload_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_logsumexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_long_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lu_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_lu_unpack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mH_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mT_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_amax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_amin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_argmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_argmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_cumprod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_cumsum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_fill_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_log_softmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_logaddexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_logsumexp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_median_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_normalize_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_select_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_softmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_softmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_std_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_sum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_masked_var_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_matmul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_matrix_exp_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_max_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_max_pool2d_with_indices_backward_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_max_reduction_no_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_max_reduction_with_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_maximum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_median_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_meshgrid_list_of_tensors_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_meshgrid_variadic_tensors_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_min_binary_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_min_reduction_no_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_min_reduction_with_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_minimum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mode_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_movedim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_msort_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mul_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_multinomial_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mv_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nan_to_num_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nanmean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nanmedian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nanquantile_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nansum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_narrow_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_narrow_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_native_batch_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_native_dropout_backward_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_native_layer_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ne_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_neg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_empty_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_empty_strided_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_full_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_ones_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_new_zeros_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nextafter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_alpha_dropout_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_avg_pool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_avg_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_avg_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_batch_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_binary_cross_entropy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_celu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_channel_shuffle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv_transpose1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv_transpose2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_conv_transpose3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_cosine_embedding_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_cosine_similarity_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_cross_entropy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_ctc_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_dropout2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_dropout3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_dropout_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_elu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_embedding_bag_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_embedding_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_fractional_max_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_fractional_max_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_gaussian_nll_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_gelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_glu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_grid_sample_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_group_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hardshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hardsigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hardswish_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hardtanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_hinge_embedding_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_huber_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_instance_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_area_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_bicubic_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_linear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_nearest_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_interpolate_trilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_kl_div_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_l1_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_layer_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_leaky_relu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_linear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_local_response_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_logsigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_margin_ranking_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_pool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_pool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_pool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool1d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool1d_grad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool2d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool2d_grad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool3d_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_max_unpool3d_grad_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_mish_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_mse_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_multi_head_attention_forward_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_multi_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_multilabel_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_nll_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_normalize_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_one_hot_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_circular_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_constant_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_reflect_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_replicate_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pad_replicate_negative_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pairwise_distance_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pdist_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pixel_shuffle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_pixel_unshuffle_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_poisson_nll_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_prelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_relu6_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_relu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_rms_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_rrelu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_selu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_silu_complex_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_silu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_smooth_l1_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_soft_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softmin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softmin_with_dtype_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softplus_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_softsign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_tanhshrink_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_threshold_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_triplet_margin_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_unfold_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_upsample_bilinear_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nn_functional_upsample_nearest_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nonzero_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_nonzero_static_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_norm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_norm_fro_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_norm_inf_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_norm_nuc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_normal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_normal_in_place_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_normal_number_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ones_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ones_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ormqr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_outer_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_pca_lowrank_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_permute_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_permute_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_pinverse_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polar_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_2_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_polygamma_polygamma_n_4_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_positive_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_pow_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_put_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_qr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_quantile_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rad2deg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rand_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_randint_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_randint_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_randn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_randn_like_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_ravel_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_real_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_reciprocal_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_remainder_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_renorm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_repeat_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_repeat_interleave_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_reshape_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_reshape_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_resize__cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_resize_as__cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_resolve_conj_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_resolve_neg_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_roll_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rot90_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_round_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_round_decimals_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_round_decimals_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_round_decimals_neg_3_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rsqrt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_rsub_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scalar_tensor_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_add_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_amax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_amin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_prod_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_scatter_reduce_sum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_searchsorted_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_select_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_select_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sgn_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_short_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sigmoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sign_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_bartlett_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_blackman_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_cosine_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_exponential_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_gaussian_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_general_cosine_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_general_hamming_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_hamming_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_hann_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_kaiser_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signal_windows_nuttall_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_signbit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sin_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sinc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sinh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_slice_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_slice_scatter_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_softmax_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_softmax_with_dtype_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sort_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sparse_mm_reduce_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sparse_sampled_addmm_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_airy_ai_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_bessel_j0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_bessel_j1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_bessel_y0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_bessel_y1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_chebyshev_polynomial_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_chebyshev_polynomial_u_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_chebyshev_polynomial_v_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_chebyshev_polynomial_w_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_entr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_erfcx_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_hermite_polynomial_h_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_hermite_polynomial_he_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_i0e_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_i1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_i1e_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_laguerre_polynomial_l_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_legendre_polynomial_p_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_log_ndtr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_modified_bessel_i0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_modified_bessel_i1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_modified_bessel_k0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_modified_bessel_k1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_ndtr_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_ndtri_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_scaled_modified_bessel_k0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_scaled_modified_bessel_k1_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_spherical_bessel_j0_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_xlog1py_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_special_zeta_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_split_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_split_list_args_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_split_with_sizes_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_split_with_sizes_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sqrt_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_square_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_squeeze_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_squeeze_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_squeeze_multiple_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_stack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_std_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_std_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_std_mean_unbiased_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_std_unbiased_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_stft_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sub_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sum_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_sum_to_size_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_svd_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_svd_lowrank_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_t_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_t_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_take_along_dim_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_take_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tan_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tanh_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tensor_split_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tensordot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tile_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_to_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_to_sparse_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_topk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_torch__scaled_mm_cuda_float8_e4m3fn, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_trace_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_transpose_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_transpose_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_trapezoid_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_trapz_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_triangular_solve_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tril_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_tril_indices_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_triu_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_triu_indices_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_true_divide_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_trunc_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unbind_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unbind_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unflatten_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unfold_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unfold_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_uniform_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unique_consecutive_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unique_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unravel_index_cuda_int64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unsafe_chunk_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unsafe_split_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unsqueeze_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_unsqueeze_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_var_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_var_mean_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_var_mean_unbiased_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_var_unbiased_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_vdot_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_as_complex_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_as_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_as_real_cuda_complex64, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_copy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_view_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_vsplit_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_vstack_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_where_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_xlogy_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_zero__cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_zeros_cuda_float32, test/test_testing.py::TestOpInfoSampleFunctionsCUDA::test_opinfo_sample_generators_zeros_like_cuda_float32 2025-08-14T21:46:43.2258890Z 2025-08-14T21:46:47.2541494Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:46:47.2542852Z import pkg_resources 2025-08-14T21:46:47.6364791Z Running test_jit_fuser_te 1/1 ... [2025-08-14 21:46:47.635912] 2025-08-14T21:46:47.6365355Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:46:47.6367280Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jit_fuser_te.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:46:47.636334] 2025-08-14T21:46:55.6340812Z 2025-08-14T21:46:55.6341792Z export/test_torchbind 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_torchbind_1.1_565cb737b1990be8_.log 2025-08-14T21:46:55.6381704Z Running 90 items in this shard: test/export/test_torchbind.py::TestExportTorchbind::test_aot_export_tensor_queue_operators, test/export/test_torchbind.py::TestExportTorchbind::test_attribute_as_custom_op_argument_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_attribute_as_custom_op_argument_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_attribute_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_attribute_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_list_out_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_list_out_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_tuple_out_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_tuple_out_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_unbacked_symint_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_custom_obj_unbacked_symint_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_deepcopy, test/export/test_torchbind.py::TestExportTorchbind::test_export_inplace_custom_op, test/export/test_torchbind.py::TestExportTorchbind::test_identifying_torchbind_ops, test/export/test_torchbind.py::TestExportTorchbind::test_input_as_custom_op_argument_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_input_as_custom_op_argument_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_input_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_input_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_schema_checking_script_object, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_methods_fakify_internal_states_make_fx_tracing_mode_fake, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_methods_fakify_internal_states_make_fx_tracing_mode_symbolic, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_methods_make_fx_tracing_mode_fake, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_methods_make_fx_tracing_mode_symbolic, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_operators_fallthrough_via_lib_impl, test/export/test_torchbind.py::TestExportTorchbind::test_make_fx_tensor_queue_operators_fallthrough_via_py_impl, test/export/test_torchbind.py::TestExportTorchbind::test_method_schema, test/export/test_torchbind.py::TestExportTorchbind::test_non_strict_export_methods, test/export/test_torchbind.py::TestExportTorchbind::test_none_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_none_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_safe_to_trace_with_real, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_alias_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_alias_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_input_and_alias_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_input_and_alias_pre_dispatch_True, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_op_fallthrough_keys_respects_lib_impl, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_op_register_fallthrough, test/export/test_torchbind.py::TestExportTorchbind::test_torchbind_register_attr_at_runtime_error, test/export/test_torchbind.py::TestExportTorchbind::test_unlift_custom_obj_pre_dispatch_False, test/export/test_torchbind.py::TestExportTorchbind::test_unlift_custom_obj_pre_dispatch_True, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_body_aliasing_contents_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_body_aliasing_contents_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_body_aliasing_contents_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_non_fakified_method_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_non_fakified_method_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_non_fakified_method_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_script_obj_missing_attr_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_script_obj_missing_attr_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_script_obj_setattr_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_script_obj_setattr_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_global_obj_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_global_obj_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_global_obj_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_as_hop_input_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_as_hop_input_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_as_hop_input_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_attributes_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_attributes_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_attributes_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_closure_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_closure_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_closure_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_graph_breaks, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cpu_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cpu_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cpu_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cuda_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cuda_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_obj_torchbind_op_with_autocast_device_cuda_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_automatic_dynamic_shape, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_guards_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_guards_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_script_object_input_guards_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_tensor_op_in_tensor_flatten_backend_aot_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_tensor_op_in_tensor_flatten_backend_eager, test/export/test_torchbind.py::TestCompileTorchbind::test_compile_tensor_op_in_tensor_flatten_backend_inductor, test/export/test_torchbind.py::TestCompileTorchbind::test_export_obj_torchbind_op_with_autocast_device_cpu, test/export/test_torchbind.py::TestCompileTorchbind::test_export_obj_torchbind_op_with_autocast_device_cuda, test/export/test_torchbind.py::TestRegisterFakeClass::test_register_fake_class_from_real_not_classmethod, test/export/test_torchbind.py::TestRegisterFakeClass::test_register_fake_class_no_from_real, test/export/test_torchbind.py::TestRegisterFakeClass::test_register_fake_class_no_torch_bind_class, test/export/test_torchbind.py::TestRegisterFakeClass::test_register_fake_class_valid 2025-08-14T21:46:55.6415693Z 2025-08-14T21:46:59.6548949Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:46:59.6550508Z import pkg_resources 2025-08-14T21:47:00.0283771Z Running test_public_bindings 1/1 ... [2025-08-14 21:47:00.027928] 2025-08-14T21:47:00.0284204Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:47:00.0288422Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_public_bindings.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:47:00.028327] 2025-08-14T21:47:04.5018956Z 2025-08-14T21:47:04.5019703Z test_public_bindings 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_public_bindings_1.1_41f4f8e50525fdff_.log 2025-08-14T21:47:04.5021467Z Running 4 items in this shard: test/test_public_bindings.py::TestPublicBindings::test_correct_module_names, test/test_public_bindings.py::TestPublicBindings::test_modules_can_be_imported, test/test_public_bindings.py::TestPublicBindings::test_no_new_bindings, test/test_public_bindings.py::TestPublicBindings::test_no_new_reexport_callables 2025-08-14T21:47:04.5023118Z 2025-08-14T21:47:08.6352004Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:47:08.6353339Z import pkg_resources 2025-08-14T21:47:09.0156706Z Running test_linalg 1/1 ... [2025-08-14 21:47:09.015146] 2025-08-14T21:47:09.0157316Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:47:09.0160790Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_linalg.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:47:09.015567] 2025-08-14T21:48:54.8955265Z 2025-08-14T21:48:54.8958526Z test_jit_fuser_te 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_jit_fuser_te_1.1_a60f594d95c3cac2_.log 2025-08-14T21:48:55.1170311Z Running 6816 items in this shard: test/test_jit_fuser_te.py::TestFuserCommon::test_autodiff_fallback, test/test_jit_fuser_te.py::TestTEFuserStatic::test_abs, test/test_jit_fuser_te.py::TestTEFuserStatic::test_adaptive_avg_pool2d, test/test_jit_fuser_te.py::TestTEFuserStatic::test_add_bool, test/test_jit_fuser_te.py::TestTEFuserStatic::test_addcmul, test/test_jit_fuser_te.py::TestTEFuserStatic::test_arg_configurations_smoke, test/test_jit_fuser_te.py::TestTEFuserStatic::test_autocast_down, test/test_jit_fuser_te.py::TestTEFuserStatic::test_autocast_up, test/test_jit_fuser_te.py::TestTEFuserStatic::test_batch_norm, test/test_jit_fuser_te.py::TestTEFuserStatic::test_binary_div_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_binary_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_binary_pow, test/test_jit_fuser_te.py::TestTEFuserStatic::test_binary_scalar_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_binary_tensor_scalar_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_bitwise_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_broadcast, test/test_jit_fuser_te.py::TestTEFuserStatic::test_cat_2k_args, test/test_jit_fuser_te.py::TestTEFuserStatic::test_cat_graph_opt, test/test_jit_fuser_te.py::TestTEFuserStatic::test_channels_last_dims_dynamic, test/test_jit_fuser_te.py::TestTEFuserStatic::test_checks_cat_inputs, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_correctness, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_distributes, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_motion_deduplicates_inputs, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_mul_one, test/test_jit_fuser_te.py::TestTEFuserStatic::test_chunk_multiple, test/test_jit_fuser_te.py::TestTEFuserStatic::test_clamp, test/test_jit_fuser_te.py::TestTEFuserStatic::test_clamp_double, test/test_jit_fuser_te.py::TestTEFuserStatic::test_clamp_int, test/test_jit_fuser_te.py::TestTEFuserStatic::test_comparison_eq_ne, test/test_jit_fuser_te.py::TestTEFuserStatic::test_comparison_ge_le, test/test_jit_fuser_te.py::TestTEFuserStatic::test_comparison_gt_lt, test/test_jit_fuser_te.py::TestTEFuserStatic::test_concat, test/test_jit_fuser_te.py::TestTEFuserStatic::test_concat_invariant, test/test_jit_fuser_te.py::TestTEFuserStatic::test_constant_chunk_shapes, test/test_jit_fuser_te.py::TestTEFuserStatic::test_conv2d, test/test_jit_fuser_te.py::TestTEFuserStatic::test_conv2d_depthwise, test/test_jit_fuser_te.py::TestTEFuserStatic::test_cuda_half, test/test_jit_fuser_te.py::TestTEFuserStatic::test_dims, test/test_jit_fuser_te.py::TestTEFuserStatic::test_disabled, test/test_jit_fuser_te.py::TestTEFuserStatic::test_div_bool, test/test_jit_fuser_te.py::TestTEFuserStatic::test_dynamic_cat, test/test_jit_fuser_te.py::TestTEFuserStatic::test_dynamic_shapes, test/test_jit_fuser_te.py::TestTEFuserStatic::test_eq_unsqueeze_type_as, test/test_jit_fuser_te.py::TestTEFuserStatic::test_erf, test/test_jit_fuser_te.py::TestTEFuserStatic::test_exhaust_specializations, test/test_jit_fuser_te.py::TestTEFuserStatic::test_exp, test/test_jit_fuser_te.py::TestTEFuserStatic::test_fusion_reuse_multi_gpu, test/test_jit_fuser_te.py::TestTEFuserStatic::test_gelu, test/test_jit_fuser_te.py::TestTEFuserStatic::test_hardsigmoid_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserStatic::test_hardswish_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserStatic::test_inlined_optimized_graph, test/test_jit_fuser_te.py::TestTEFuserStatic::test_isnan, test/test_jit_fuser_te.py::TestTEFuserStatic::test_kernel_cache_multi_gpu, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lerp, test/test_jit_fuser_te.py::TestTEFuserStatic::test_list_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lstm, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lstm_concat, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lstm_gates_permutations, test/test_jit_fuser_te.py::TestTEFuserStatic::test_lstm_traced, test/test_jit_fuser_te.py::TestTEFuserStatic::test_masked_fill, test/test_jit_fuser_te.py::TestTEFuserStatic::test_matmul, test/test_jit_fuser_te.py::TestTEFuserStatic::test_milstm, test/test_jit_fuser_te.py::TestTEFuserStatic::test_minmax, test/test_jit_fuser_te.py::TestTEFuserStatic::test_minmax_int_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_mul_bool, test/test_jit_fuser_te.py::TestTEFuserStatic::test_neg_pow, test/test_jit_fuser_te.py::TestTEFuserStatic::test_nonzero_device_cuda, test/test_jit_fuser_te.py::TestTEFuserStatic::test_nop, test/test_jit_fuser_te.py::TestTEFuserStatic::test_pow_multiple_dtype, test/test_jit_fuser_te.py::TestTEFuserStatic::test_profiler, test/test_jit_fuser_te.py::TestTEFuserStatic::test_rand_broadcast_cuda, test/test_jit_fuser_te.py::TestTEFuserStatic::test_rand_cuda, test/test_jit_fuser_te.py::TestTEFuserStatic::test_rand_diamond, test/test_jit_fuser_te.py::TestTEFuserStatic::test_relu, test/test_jit_fuser_te.py::TestTEFuserStatic::test_relu_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserStatic::test_remove_output_used_only_in_size, test/test_jit_fuser_te.py::TestTEFuserStatic::test_scalar, test/test_jit_fuser_te.py::TestTEFuserStatic::test_scalar_arg, test/test_jit_fuser_te.py::TestTEFuserStatic::test_scalar_only_inputs, test/test_jit_fuser_te.py::TestTEFuserStatic::test_skip_grad_in_check, test/test_jit_fuser_te.py::TestTEFuserStatic::test_small_constant, test/test_jit_fuser_te.py::TestTEFuserStatic::test_sub_gt_and, test/test_jit_fuser_te.py::TestTEFuserStatic::test_sum_dim, test/test_jit_fuser_te.py::TestTEFuserStatic::test_sum_keepdim_cast, test/test_jit_fuser_te.py::TestTEFuserStatic::test_sum_simple, test/test_jit_fuser_te.py::TestTEFuserStatic::test_superslomo, test/test_jit_fuser_te.py::TestTEFuserStatic::test_tensor_scalar_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_ternary_norm_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_ternary_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_threshold, test/test_jit_fuser_te.py::TestTEFuserStatic::test_to_device, test/test_jit_fuser_te.py::TestTEFuserStatic::test_to_dtype, test/test_jit_fuser_te.py::TestTEFuserStatic::test_torch_to, test/test_jit_fuser_te.py::TestTEFuserStatic::test_type_as_cat, test/test_jit_fuser_te.py::TestTEFuserStatic::test_typecheck, test/test_jit_fuser_te.py::TestTEFuserStatic::test_unary_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_unrolled_cat, test/test_jit_fuser_te.py::TestTEFuserStatic::test_unsqueeze_size_calculation, test/test_jit_fuser_te.py::TestTEFuserStatic::test_unsqueeze_var_dim, test/test_jit_fuser_te.py::TestTEFuserStatic::test_unsupported_dtypes, test/test_jit_fuser_te.py::TestTEFuserStatic::test_where_and_typing, test/test_jit_fuser_te.py::TestTEFuserStatic::test_where_ops, test/test_jit_fuser_te.py::TestTEFuserStatic::test_with_strict_fusion, test/test_jit_fuser_te.py::TestTEFuserStatic::test_zero_element_tensors, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_abs, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_adaptive_avg_pool2d, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_add_bool, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_addcmul, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_arg_configurations_smoke, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_autocast_down, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_autocast_up, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_batch_norm, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_binary_div_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_binary_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_binary_pow, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_binary_scalar_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_binary_tensor_scalar_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_bitwise_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_broadcast, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_cat_2k_args, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_cat_graph_opt, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_channels_last_dims_dynamic, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_checks_cat_inputs, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_correctness, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_distributes, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_motion_deduplicates_inputs, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_mul_one, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_chunk_multiple, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_clamp, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_clamp_double, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_clamp_int, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_comparison_eq_ne, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_comparison_ge_le, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_comparison_gt_lt, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_concat, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_concat_invariant, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_constant_chunk_shapes, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_conv2d, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_conv2d_depthwise, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_cuda_half, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_dims, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_disabled, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_div_bool, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_dynamic_cat, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_dynamic_shapes, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_eq_unsqueeze_type_as, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_erf, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_exhaust_specializations, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_exp, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_fusion_reuse_multi_gpu, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_gelu, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_hardsigmoid_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_hardswish_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_inlined_optimized_graph, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_isnan, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_kernel_cache_multi_gpu, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_lerp, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_list_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_lstm, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_lstm_concat, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_lstm_gates_permutations, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_lstm_traced, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_masked_fill, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_matmul, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_milstm, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_minmax, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_minmax_int_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_mul_bool, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_neg_pow, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_nonzero_device_cuda, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_nop, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_pow_multiple_dtype, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_profiler, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_rand_broadcast_cuda, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_rand_cuda, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_rand_diamond, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_relu, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_relu_fwd_bwd, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_remove_output_used_only_in_size, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_scalar, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_scalar_arg, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_scalar_only_inputs, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_skip_grad_in_check, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_small_constant, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_sub_gt_and, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_sum_dim, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_sum_keepdim_cast, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_sum_simple, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_superslomo, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_tensor_scalar_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_ternary_norm_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_ternary_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_threshold, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_to_device, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_to_dtype, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_torch_to, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_type_as_cat, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_typecheck, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_unary_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_unrolled_cat, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_unsqueeze_size_calculation, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_unsqueeze_var_dim, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_unsupported_dtypes, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_where_and_typing, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_where_ops, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_with_strict_fusion, test/test_jit_fuser_te.py::TestTEFuserDynamic::test_zero_element_tensors, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_failures___rmatmul___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_failures_frac_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_failures_matmul_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_H_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_T_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___getitem___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___radd___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rand___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rand___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rand___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rand___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rand___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rand___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rdiv___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmod___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rmul___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___ror___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rpow___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rsub___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness___rxor___cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__batch_norm_with_update_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__batch_norm_with_update_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__batch_norm_with_update_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__batch_norm_with_update_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__chunk_cat_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__native_batch_norm_legit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__native_batch_norm_legit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__native_batch_norm_legit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__native_batch_norm_legit_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_lengths_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_lengths_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_lengths_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_lengths_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_offsets_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_offsets_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_offsets_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__segment_reduce_offsets_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__softmax_backward_data_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__softmax_backward_data_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__softmax_backward_data_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__softmax_backward_data_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__upsample_bilinear2d_aa_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__upsample_bilinear2d_aa_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__upsample_bilinear2d_aa_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness__upsample_bilinear2d_aa_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_abs_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acos_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_acosh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_add_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addbmm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcdiv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addcmul_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmm_decomposed_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmv_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmv_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmv_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addmv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_addr_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_alias_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_all_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_allclose_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_allclose_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_allclose_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_allclose_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_allclose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_allclose_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_amin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_aminmax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_angle_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_any_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_arange_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argmin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argsort_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_argwhere_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_partial_views_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_as_strided_scatter_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_asinh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atan_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atanh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_1d_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_2d_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_atleast_3d_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_baddbmm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bernoulli_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bernoulli_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bernoulli_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bernoulli_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bfloat16_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bincount_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bincount_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bincount_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bincount_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bincount_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_and_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_left_shift_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_left_shift_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_left_shift_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_left_shift_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_left_shift_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_not_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_not_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_not_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_not_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_not_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_not_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_or_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_or_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_or_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_or_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_or_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_or_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_right_shift_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_right_shift_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_right_shift_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_right_shift_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_right_shift_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bitwise_xor_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_block_diag_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bmm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bmm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bmm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bmm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bmm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bool_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_shapes_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_tensors_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_broadcast_to_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_bucketize_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_byte_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cartesian_prod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cat_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cauchy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cauchy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cauchy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cauchy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdist_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdist_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cdouble_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ceil_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cfloat_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chalf_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_char_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_inverse_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_inverse_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_inverse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_inverse_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_solve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cholesky_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_chunk_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_max_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clamp_min_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_clone_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_column_stack_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_combinations_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_complex_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_complex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_complex_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_conj_physical_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_constant_pad_nd_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_contiguous_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_copysign_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_corrcoef_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cos_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cosh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_count_nonzero_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cov_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cross_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cummin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumprod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumsum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_cumulative_trapezoid_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_deg2rad_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diag_embed_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagflat_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diagonal_scatter_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_diff_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_digamma_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dist_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_floor_rounding_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_no_rounding_mode_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_div_trunc_rounding_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dot_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dot_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dot_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_double_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dsplit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_dstack_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_einsum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_like_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_permuted_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_empty_strided_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eq_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_equal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erf_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfc_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_erfinv_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exp_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_as_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expand_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_expm1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exponential_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exponential_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exponential_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_exponential_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float8_e4m3fn, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float8_e4m3fnuz, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float8_e5m2, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_float8_e5m2fnuz, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_eye_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_fftshift_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_hfftn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ifftshift_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_ihfftn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_irfftn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfft_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fft_rfftn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fill_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flatten_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flip_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fliplr_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_flipud_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_float_power_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_floor_divide_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_fmod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_frexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_frexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_frexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_frexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_uint16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_uint32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_full_like_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gather_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gcd_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ge_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geometric_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geqrf_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geqrf_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geqrf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_geqrf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gradient_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_grid_sampler_2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_gt_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_half_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hash_tensor_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_heaviside_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_histc_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hsplit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hstack_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hypot_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hypot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hypot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_hypot_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_i0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_igamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_igamma_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_igammac_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_igammac_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_imag_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_imag_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_imag_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_add_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_fill_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_put_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_amin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_mean_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_reduce_prod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_index_select_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_inner_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_inner_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_inner_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_inner_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_inner_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_inner_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_int_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isclose_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isfinite_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isinf_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isnan_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isneginf_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isposinf_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_isreal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_istft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_istft_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_item_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_2inputs_2outputs_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_binary_return_by_ref_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_jiterator_unary_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kron_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_kthvalue_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lcm_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lcm_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lcm_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lcm_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lcm_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ldexp_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_le_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lerp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lgamma_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_ex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cholesky_ex_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cond_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cond_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cond_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cond_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_cross_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_det_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_det_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_det_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_det_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_diagonal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eig_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eig_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eig_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eig_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvals_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvals_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvals_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvals_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvalsh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvalsh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvalsh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_eigvalsh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_householder_product_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_householder_product_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_householder_product_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_householder_product_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_ex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_inv_ex_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_ex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_factor_ex_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_solve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_ldl_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_grad_oriented_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_grad_oriented_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_grad_oriented_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lstsq_grad_oriented_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_ex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_factor_ex_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_solve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_lu_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_power_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_power_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_power_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_power_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_hermitian_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_hermitian_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_hermitian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_matrix_rank_hermitian_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_multi_dot_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_norm_subgradients_at_zero_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_hermitian_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_hermitian_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_hermitian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_hermitian_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_singular_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_singular_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_singular_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_pinv_singular_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_qr_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_qr_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_qr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_qr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_slogdet_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_slogdet_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_slogdet_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_slogdet_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_ex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_ex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_ex_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_triangular_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_triangular_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_triangular_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_solve_triangular_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svd_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svd_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svd_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svdvals_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svdvals_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svdvals_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_svdvals_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorinv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorinv_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorinv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorsolve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorsolve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorsolve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_tensorsolve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vander_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vecdot_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linalg_vector_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_linspace_tensor_overload_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log10_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log1p_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_normal_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_normal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_normal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_normal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_log_softmax_with_dtype_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp2_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logaddexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logcumsumexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logcumsumexp_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logcumsumexp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logcumsumexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logcumsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logcumsumexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logdet_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logdet_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logdet_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logdet_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_and_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_not_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_or_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logical_xor_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logspace_tensor_overload_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_logsumexp_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_long_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lt_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_solve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_unpack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_unpack_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_unpack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_lu_unpack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mH_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mT_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_amin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_argmin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumprod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_cumsum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_fill_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_log_softmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_log_softmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_log_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_log_softmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logaddexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logaddexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logaddexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logaddexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_logsumexp_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_median_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_median_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_median_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_median_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_normalize_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_prod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_scatter_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_select_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_softmin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_std_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_sum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_masked_var_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_matrix_exp_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_binary_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_pool2d_with_indices_backward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_pool2d_with_indices_backward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_pool2d_with_indices_backward_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_no_dim_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_max_reduction_with_dim_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_maximum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mean_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_median_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_list_of_tensors_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_meshgrid_variadic_tensors_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_binary_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_no_dim_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_min_reduction_with_dim_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_minimum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mode_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_movedim_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_msort_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mul_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_multinomial_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_multinomial_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_multinomial_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_multinomial_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mv_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nan_to_num_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmean_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmean_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanmedian_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanquantile_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nanquantile_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nansum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_narrow_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_batch_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_batch_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_batch_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_batch_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_dropout_backward_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_dropout_backward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_dropout_backward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_dropout_backward_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_layer_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_layer_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_layer_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_native_layer_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ne_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_neg_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_empty_strided_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_full_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_ones_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_new_zeros_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nextafter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nextafter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nextafter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nextafter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_alpha_dropout_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_alpha_dropout_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_alpha_dropout_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_alpha_dropout_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_avg_pool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_bilinear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_bilinear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_bilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_bilinear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_celu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_celu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_celu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_celu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_channel_shuffle_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose3d_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose3d_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose3d_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_conv_transpose3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_similarity_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_similarity_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_similarity_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cosine_similarity_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cross_entropy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cross_entropy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cross_entropy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_cross_entropy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_ctc_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_ctc_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_dropout_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_elu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_elu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_elu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_elu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_bag_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_bag_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_bag_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_bag_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_embedding_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_fractional_max_pool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gaussian_nll_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gaussian_nll_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gaussian_nll_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gelu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gelu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gelu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_gelu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_glu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_glu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_glu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_glu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_grid_sample_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_grid_sample_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_grid_sample_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_grid_sample_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_group_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_group_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_group_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_group_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardshrink_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardshrink_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardshrink_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardsigmoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardsigmoid_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardsigmoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardsigmoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardswish_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardswish_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardswish_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardswish_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hardtanh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hinge_embedding_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hinge_embedding_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_hinge_embedding_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_huber_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_huber_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_huber_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_huber_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_instance_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_instance_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_instance_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_instance_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_area_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_area_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_area_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_area_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bicubic_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bicubic_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bicubic_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bilinear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_bilinear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_linear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_linear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_linear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_linear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_nearest_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_trilinear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_trilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_interpolate_trilinear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_kl_div_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_kl_div_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_kl_div_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_kl_div_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_l1_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_layer_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_layer_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_layer_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_layer_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_leaky_relu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_leaky_relu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_leaky_relu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_leaky_relu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_linear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_local_response_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_local_response_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_local_response_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_local_response_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_logsigmoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_logsigmoid_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_logsigmoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_logsigmoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_margin_ranking_loss_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_pool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_grad_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool1d_grad_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_grad_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool2d_grad_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_grad_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_max_unpool3d_grad_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mish_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mish_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mish_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mish_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mse_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mse_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mse_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_mse_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_head_attention_forward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_head_attention_forward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_head_attention_forward_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_margin_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multi_margin_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_margin_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_margin_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_nll_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_nll_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_nll_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_nll_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_normalize_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_one_hot_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_circular_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_constant_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_reflect_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pad_replicate_negative_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pairwise_distance_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pdist_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pdist_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_shuffle_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_pixel_unshuffle_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_poisson_nll_loss_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_prelu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_prelu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_prelu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_prelu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu6_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_relu_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rms_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rms_norm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rms_norm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rms_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rms_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rms_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rrelu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rrelu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rrelu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_rrelu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_selu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_selu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_selu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_selu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_complex_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_complex_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_silu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_smooth_l1_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_smooth_l1_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_smooth_l1_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_soft_margin_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_soft_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_soft_margin_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softmin_with_dtype_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softplus_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softplus_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softplus_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softplus_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softshrink_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softshrink_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softshrink_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_softsign_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_tanhshrink_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_threshold_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_loss_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_unfold_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_bilinear_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_bilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_bilinear_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_nearest_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_nearest_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_nearest_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_nearest_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nn_functional_upsample_nearest_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_nonzero_static_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_fro_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_fro_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_fro_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_fro_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_fro_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_fro_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_inf_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_inf_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_inf_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_inf_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_inf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_inf_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_nuc_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_nuc_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_nuc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_norm_nuc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_in_place_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_in_place_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_in_place_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_in_place_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_in_place_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_in_place_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_number_mean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_number_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_number_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_normal_number_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ones_like_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ormqr_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ormqr_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ormqr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ormqr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_outer_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pca_lowrank_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pca_lowrank_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pca_lowrank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pca_lowrank_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_permute_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pinverse_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pinverse_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pinverse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pinverse_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polar_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polar_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_2_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_3_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_polygamma_polygamma_n_4_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_positive_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_pow_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_prod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_put_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_qr_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_qr_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_qr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_qr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_quantile_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_quantile_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rad2deg_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rand_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randint_like_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_randn_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_ravel_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_real_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reciprocal_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_remainder_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_renorm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_repeat_interleave_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_as_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_reshape_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize__cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resize_as__cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_conj_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_resolve_neg_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_roll_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rot90_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_0_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_0_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_3_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_3_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_3_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_neg_3_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_neg_3_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_neg_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_round_decimals_neg_3_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsqrt_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_rsub_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scalar_tensor_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_add_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amax_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_amin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_mean_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_prod_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_scatter_reduce_sum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_searchsorted_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_select_scatter_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sgn_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_short_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sigmoid_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sign_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_bartlett_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_bartlett_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_blackman_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_blackman_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_cosine_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_cosine_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_exponential_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_exponential_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_gaussian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_gaussian_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_general_cosine_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_general_cosine_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_general_hamming_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_general_hamming_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_hamming_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_hamming_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_hann_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_hann_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_kaiser_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_kaiser_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_nuttall_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signal_windows_nuttall_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_signbit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sin_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinc_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sinh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_slice_scatter_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_softmax_with_dtype_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sort_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_mm_reduce_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_mm_reduce_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_mm_reduce_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_mm_reduce_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_sampled_addmm_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_sampled_addmm_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_sampled_addmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sparse_sampled_addmm_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_airy_ai_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_j1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_bessel_y1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_t_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_u_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_v_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_chebyshev_polynomial_w_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_entr_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_erfcx_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_h_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_hermite_polynomial_he_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i0e_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_i1e_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_laguerre_polynomial_l_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_legendre_polynomial_p_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_log_ndtr_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_i1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_modified_bessel_k1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtr_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_ndtri_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_scaled_modified_bessel_k1_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_spherical_bessel_j0_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_xlog1py_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_special_zeta_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_list_args_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_split_with_sizes_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sqrt_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_square_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_squeeze_multiple_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stack_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_mean_unbiased_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_std_unbiased_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stft_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stft_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_stft_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sub_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_sum_to_size_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_lowrank_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_lowrank_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_lowrank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_svd_lowrank_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_t_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_along_dim_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_take_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tan_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tanh_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensor_split_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensordot_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensordot_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensordot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensordot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensordot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tensordot_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tile_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_to_sparse_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_topk_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch__scaled_mm_cuda_float8_e4m3fn, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trace_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_transpose_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapezoid_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trapz_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triangular_solve_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triangular_solve_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triangular_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triangular_solve_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_indices_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_tril_indices_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_indices_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_triu_indices_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_true_divide_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_trunc_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unbind_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unflatten_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unfold_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_uniform_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_consecutive_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_uint16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_uint32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_uint64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unique_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unravel_index_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unravel_index_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unravel_index_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unravel_index_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unravel_index_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_chunk_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsafe_split_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_unsqueeze_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_mean_unbiased_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_unbiased_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_unbiased_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_unbiased_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_unbiased_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_var_unbiased_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vdot_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_complex_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_complex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_complex_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_real_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_as_real_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_copy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_view_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vsplit_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_vstack_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_where_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_xlogy_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zero__cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_bfloat16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_bool, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_complex128, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_complex32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_complex64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_float16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_float64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_int16, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_int32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_int64, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_int8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_nnc_correctness_zeros_like_cuda_uint8, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_H_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_T_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported___getitem___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported___rpow___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported___rsub___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__batch_norm_with_update_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__chunk_cat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__native_batch_norm_legit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__segment_reduce_lengths_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__segment_reduce_offsets_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__softmax_backward_data_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__unsafe_masked_index_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__unsafe_masked_index_put_accumulate_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported__upsample_bilinear2d_aa_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_acosh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_addbmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_addcdiv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_addmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_addmv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_addr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_alias_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_all_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_allclose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_aminmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_angle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_any_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_arange_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_argmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_argmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_argsort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_argwhere_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_as_strided_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_as_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_as_strided_partial_views_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_as_strided_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_asinh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_atanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_atleast_1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_atleast_2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_atleast_3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_baddbmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_bernoulli_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_bfloat16_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_block_diag_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_bmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_broadcast_shapes_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_broadcast_tensors_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_broadcast_to_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_bucketize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cartesian_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cauchy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cdist_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cdouble_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cfloat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_chalf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cholesky_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cholesky_inverse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cholesky_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_chunk_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_clamp_max_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_clamp_min_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_clone_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_column_stack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_combinations_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_complex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_conj_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_conj_physical_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_constant_pad_nd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_copysign_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_corrcoef_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_count_nonzero_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cov_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cross_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cummax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cummin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cumprod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cumsum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_cumulative_trapezoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_deg2rad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diag_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diag_embed_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diagflat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diagonal_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diagonal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diagonal_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_diff_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_digamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_dist_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_dot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_dsplit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_dstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_einsum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_empty_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_empty_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_empty_permuted_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_empty_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_equal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_erfinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_exp2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_expand_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_exponential_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_eye_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_fft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_fft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_fftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_fftshift_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_hfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_hfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_hfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ifft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ifft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ifftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ifftshift_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ihfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ihfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_ihfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_irfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_irfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_irfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_rfft2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_rfft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fft_rfftn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_flatten_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_flip_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fliplr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_flipud_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_float_power_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_floor_divide_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_fmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_frexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_full_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_full_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_gather_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_geometric_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_geqrf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_gradient_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_grid_sampler_2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_hash_tensor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_heaviside_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_histc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_hsplit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_hstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_hypot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_i0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_igamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_igammac_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_add_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_put_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_reduce_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_reduce_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_reduce_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_reduce_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_index_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_inner_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isclose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isfinite_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isinf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isneginf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isposinf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_isreal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_item_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_jiterator_2inputs_2outputs_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_jiterator_4inputs_with_extra_args_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_jiterator_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_jiterator_binary_return_by_ref_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_jiterator_unary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_kron_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_kthvalue_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ldexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_cholesky_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_cholesky_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_cond_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_cross_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_det_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_diagonal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_eig_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_eigh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_eigvals_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_eigvalsh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_householder_product_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_inv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_inv_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_ldl_factor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_ldl_factor_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_ldl_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_lstsq_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_lstsq_grad_oriented_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_lu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_lu_factor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_lu_factor_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_lu_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_matrix_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_matrix_power_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_matrix_rank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_matrix_rank_hermitian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_multi_dot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_norm_subgradients_at_zero_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_pinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_pinv_hermitian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_pinv_singular_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_qr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_slogdet_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_solve_ex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_solve_triangular_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_svd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_svdvals_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_tensorinv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_tensorsolve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_vander_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_vecdot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linalg_vector_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linspace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_linspace_tensor_overload_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_log_normal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_log_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_log_softmax_with_dtype_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logaddexp2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logaddexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logcumsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logdet_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logical_and_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logical_not_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logical_or_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logical_xor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logspace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logspace_tensor_overload_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_logsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_lu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_lu_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_lu_unpack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mH_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mT_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_argmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_argmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_cumprod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_cumsum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_log_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_logaddexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_logsumexp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_median_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_normalize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_softmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_std_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_sum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_masked_var_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_matrix_exp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_max_pool2d_with_indices_backward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_max_reduction_no_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_max_reduction_with_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_maximum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_median_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_meshgrid_list_of_tensors_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_meshgrid_variadic_tensors_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_min_reduction_no_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_min_reduction_with_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_minimum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mode_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_movedim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_msort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_multinomial_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mv_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nan_to_num_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nanmean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nanmedian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nanquantile_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nansum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_narrow_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_narrow_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_native_batch_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_native_dropout_backward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_native_layer_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_empty_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_empty_strided_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_full_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_ones_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_new_zeros_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nextafter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_alpha_dropout_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_avg_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_avg_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_avg_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_batch_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_bilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_binary_cross_entropy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_celu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_channel_shuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_conv1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_conv2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_conv3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_conv_transpose1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_conv_transpose2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_conv_transpose3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_cosine_embedding_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_cosine_similarity_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_cross_entropy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_ctc_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_dropout2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_dropout3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_dropout_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_elu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_embedding_bag_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_embedding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_fractional_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_fractional_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_gaussian_nll_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_gelu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_glu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_grid_sample_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_group_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_hinge_embedding_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_huber_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_instance_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_area_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_bicubic_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_bilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_linear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_nearest_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_interpolate_trilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_kl_div_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_l1_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_layer_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_linear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_local_response_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_logsigmoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_margin_ranking_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_pool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_pool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_pool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_unpool1d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_unpool1d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_unpool2d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_unpool2d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_unpool3d_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_max_unpool3d_grad_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_mish_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_mse_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_multi_head_attention_forward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_multi_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_multilabel_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_nll_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_normalize_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pad_circular_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pad_constant_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pad_reflect_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pad_replicate_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pad_replicate_negative_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pairwise_distance_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pdist_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pixel_shuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_pixel_unshuffle_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_poisson_nll_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_prelu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_rms_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_rrelu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_selu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_silu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_smooth_l1_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_soft_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_softmin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_softmin_with_dtype_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_softshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_triplet_margin_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_unfold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_upsample_bilinear_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nn_functional_upsample_nearest_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nonzero_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_nonzero_static_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_norm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_norm_fro_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_norm_inf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_norm_nuc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_normal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_normal_in_place_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_normal_number_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ones_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ones_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ormqr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_outer_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_pca_lowrank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_permute_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_pinverse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polar_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_polygamma_polygamma_n_4_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_positive_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_put_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_qr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_quantile_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_rad2deg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_rand_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_randint_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_randint_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_randn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_randn_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_ravel_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_real_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_renorm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_repeat_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_repeat_interleave_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_resize__cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_resize_as__cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_resolve_conj_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_resolve_neg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_roll_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_rot90_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_round_decimals_0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_round_decimals_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_round_decimals_neg_3_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scalar_tensor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_add_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_reduce_amax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_reduce_amin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_reduce_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_reduce_prod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_scatter_reduce_sum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_searchsorted_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_select_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_select_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sgn_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_bartlett_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_blackman_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_cosine_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_exponential_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_gaussian_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_general_cosine_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_general_hamming_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_hamming_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_hann_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_kaiser_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signal_windows_nuttall_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_signbit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sinc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_slice_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_slice_scatter_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_softmax_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_softmax_with_dtype_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sort_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sparse_mm_reduce_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sparse_sampled_addmm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_airy_ai_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_bessel_j0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_bessel_j1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_bessel_y0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_bessel_y1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_chebyshev_polynomial_t_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_chebyshev_polynomial_u_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_chebyshev_polynomial_v_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_chebyshev_polynomial_w_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_entr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_erfcx_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_hermite_polynomial_h_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_hermite_polynomial_he_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_i0e_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_i1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_i1e_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_laguerre_polynomial_l_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_legendre_polynomial_p_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_log_ndtr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_modified_bessel_i0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_modified_bessel_i1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_modified_bessel_k0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_modified_bessel_k1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_ndtr_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_ndtri_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_scaled_modified_bessel_k0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_scaled_modified_bessel_k1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_spherical_bessel_j0_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_xlog1py_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_special_zeta_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_split_list_args_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_split_with_sizes_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_split_with_sizes_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_square_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_squeeze_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_squeeze_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_squeeze_multiple_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_stack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_std_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_std_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_std_mean_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_std_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_stft_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_sum_to_size_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_svd_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_svd_lowrank_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_t_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_take_along_dim_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_take_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_tensor_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_tensordot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_tile_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_to_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_to_sparse_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_topk_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_trace_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_transpose_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_trapezoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_trapz_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_triangular_solve_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_tril_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_triu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unbind_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unbind_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unflatten_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unfold_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unfold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_uniform_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unique_consecutive_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unique_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unsafe_chunk_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unsafe_split_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_unsqueeze_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_var_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_var_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_var_mean_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_var_unbiased_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_vdot_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_view_as_complex_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_view_copy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_vsplit_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_vstack_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_xlogy_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_zero__cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_zeros_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_unsupported_zeros_like_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working___radd___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working___rdiv___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working___rmod___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working___rmul___cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_abs_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_acos_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_add_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_addcmul_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_addmm_decomposed_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_asin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_atan2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_atan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_bool_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_byte_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_ceil_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_char_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_clamp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_contiguous_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_cos_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_cosh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_div_floor_rounding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_div_no_rounding_mode_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_div_trunc_rounding_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_double_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_eq_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_erf_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_erfc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_exp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_expand_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_expand_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_expm1_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_float_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_floor_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_fmod_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_ge_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_gt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_half_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_int_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_isnan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_le_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_lerp_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_lgamma_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_log10_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_log1p_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_log2_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_log_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_long_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_lt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_masked_fill_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_max_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_mean_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_min_binary_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_mm_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_mul_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_ne_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_neg_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_hardshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_hardsigmoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_hardswish_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_hardtanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_leaky_relu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_relu6_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_relu_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_softplus_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_softsign_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_tanhshrink_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_nn_functional_threshold_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_permute_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_pow_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_reciprocal_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_remainder_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_reshape_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_reshape_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_round_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_rsqrt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_rsub_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_short_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sigmoid_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sign_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sin_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sinh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sqrt_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sub_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_sum_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_t_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_tan_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_tanh_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_transpose_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_true_divide_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_trunc_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_unsqueeze_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_view_as_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_view_cuda_float32, test/test_jit_fuser_te.py::TestNNCOpInfoCUDA::test_working_where_cuda_float32 2025-08-14T21:48:55.3250634Z 2025-08-14T21:48:58.9957158Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:48:58.9958527Z import pkg_resources 2025-08-14T21:48:59.3722559Z Running inductor/test_torchinductor_dynamic_shapes 1/2 ... [2025-08-14 21:48:59.371744] 2025-08-14T21:48:59.3723119Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:48:59.3726170Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_dynamic_shapes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:48:59.372120] 2025-08-14T21:50:45.3232084Z 2025-08-14T21:50:45.3232938Z inductor/test_aot_inductor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_1.1_ed46402d2e231278_.log 2025-08-14T21:50:45.3590655Z Running 852 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorLoggingTest::test_shape_env_reuse, test/inductor/test_aot_inductor.py::AOTInductorLoggingTest::test_shape_env_reuse_zero_consts_use_consts_asm_false, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_compile_standalone_explicit_set, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_compile_standalone_package_cpp_false_raises, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_compile_standalone_sets_package_cpp, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_no_compile_standalone, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__int_mm_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_add_complex_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_addmm_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_addmm_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aliased_buffer_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_amp_fallback_random_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aot_inductor_consts_cpp_build_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_constant_tensor_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_constant_tensor_name_collision_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_cpp_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_fp8_dtype_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_sym_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_user_defined_triton_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printing_model_inputs_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_profiler_enable_kernel_profile_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_profiler_enable_kernel_profile_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_runtime_asserts_backed_symint_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_runtime_asserts_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_assert_async_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_assert_tensor_meta_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_autotune_with_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_autotuning_args_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_backward_no_op_logging_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_bmm_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_bool_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_boolean_indexing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_3_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_4_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_and_force_mmap_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_clamp_decomposition_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_composed_dynamic_size_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_mismatched_branch_output_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_mismatched_branch_output_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_nested_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_non_tensor_predicates_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_share_predicte_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_symint_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_unbacked_symint_closure_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_unbacked_symint_closure_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_use_buffers_from_outer_scope_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_multiple_outputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_outer_code_before_after_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_parameters_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_reinterpret_view_inputs_outputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_folding_with_update_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_original_fqn_and_dtype_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_type_propagation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv3d_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_convolution_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_copy_non_blocking_is_pinned_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_d2h_copy_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_deconv_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_with_refinement_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_duplicate_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_duplicated_params_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_cat_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_scalar_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_smem_above_default_limit_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_embedding_bag_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_cat_dtype_promotion_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_graph_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_extract_constants_map_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fake_tensor_device_validation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fallback_kernel_with_symexpr_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fallback_mem_leak_fix_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fft_c2c_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fill__fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_foreach_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fp8_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fp8_view_of_param_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_free_inactive_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fx_gm_return_tuple_validation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_index_put_with_none_index_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_inf_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_input_codegen_with_sympy_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_int_list_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_issue_140766_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_dynamic_dim_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_mmaped_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_weight_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_dynamic_maxautotune_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_load_package_multiple_gpus_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_masked_select_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misaligned_input_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misaligned_input_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misc_1_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misc_1_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_missing_cubin_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_missing_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multi_device_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_multiple_output_alias_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nan_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_narrow_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nested_tensor_from_jagged_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_no_args_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_contiguous_output_alias_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_default_gpu_device_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_non_tensor_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_none_args_aot_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_normal_functional_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_on_gpu_device1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_misaligned_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_pad_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_poi_multiple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_profile_benchmark_harness_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_abs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_hann_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_permute_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_squeeze_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_pytree_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_quanatized_int8_linear_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_quantized_linear_bias_none_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_quantized_linear_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeat_interleave_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeat_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_calling_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_replicate_on_devices_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_view_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_run_with_grad_enabled_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_complex_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_device_type_failed_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_dtype_failed_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_fp8_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_large_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_shape_failed_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_same_backing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scaled_dot_product_efficient_attention_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scatter_reduce_fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_seq_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_shifted_constraint_ranges_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_dynamic_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_False_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_False_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_True_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_True_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_multi_arch_embed_kernel_binary_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_split_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_from_multi_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_and_mul_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_expr_transitive_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_small_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_so_without_weight_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_stft_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_stride_with_unbacked_expr_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_subclasses_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sym_i64_input_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symbool_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symfloat_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symint_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sympy_cpp_printer_min_max_minmax0_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sympy_cpp_printer_min_max_minmax1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_torchvision_transforms_functional_tensor_resize_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_autotuning_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_dynamic_launcher_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_dynamic_launcher_grid_infer_from_tensor_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_bool_param_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_dynamic_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_dynamic_shape_with_div_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_extern_kernel_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_multi_output_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_reinterpret_view_mem_leak_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_sympy_expr_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_sympy_fn_like_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_weird_param_order_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_with_none_input_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_mutated_autotuning_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_next_power_of_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_constant_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_constant_buffer_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_inactive_constant_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_user_managed_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_upper_bound_i64_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_using_model_name_for_files_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_view_outputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_weight_on_disk_legacy_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_nested_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_conv_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_conv_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_mixed_device_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_mixed_device_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_buffers_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_code_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_parameters_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_pytree_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_sym_expr_cond_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_sym_expr_cond_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_unbacked_symint_closure_dynamic_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_unbacked_symint_closure_dynamic_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_cudagraphs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_no_triton_profiler_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_offset_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_with_profiler_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_grid_with_backed_symbols_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_grid_with_unbacked_symbols_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_size_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_size_weight_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__int_mm_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_add_complex_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_addmm_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_addmm_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aliased_buffer_reuse_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_amp_fallback_random_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aot_inductor_consts_cpp_build_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_constant_tensor_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_constant_tensor_name_collision_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_cpp_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_fp8_dtype_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_sym_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_user_defined_triton_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printing_model_inputs_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_profiler_enable_kernel_profile_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_profiler_enable_kernel_profile_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_runtime_asserts_backed_symint_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_runtime_asserts_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_assert_async_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_assert_tensor_meta_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_autotune_with_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_autotuning_args_reuse_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_backward_no_op_logging_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_bmm_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_bool_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_boolean_indexing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_3_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_4_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_and_force_mmap_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_reuse_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_clamp_decomposition_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_composed_dynamic_size_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_mismatched_branch_output_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_mismatched_branch_output_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_nested_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_non_tensor_predicates_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_non_tensor_predicates_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_share_predicte_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_simple_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_symint_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_unbacked_symint_closure_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_unbacked_symint_closure_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_use_buffers_from_outer_scope_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_multiple_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_outer_code_before_after_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_parameters_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_reinterpret_view_inputs_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_consecutive_compiles_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_folding_with_update_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_original_fqn_and_dtype_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_type_propagation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_conv3d_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_conv_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_convolution_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_copy_non_blocking_is_pinned_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_d2h_copy_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_deconv_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dup_unbacked_sym_decl_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dup_unbacked_sym_decl_with_refinement_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_duplicate_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_duplicated_params_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dynamic_cat_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dynamic_scalar_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dynamic_smem_above_default_limit_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_embedding_bag_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_cat_dtype_promotion_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_graph_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_extract_constants_map_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fake_tensor_device_validation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fallback_kernel_with_symexpr_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fallback_mem_leak_fix_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fft_c2c_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fill__fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_foreach_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fp8_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fp8_view_of_param_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fqn_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_free_inactive_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fx_gm_return_tuple_validation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_index_put_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_index_put_with_none_index_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_inf_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_input_codegen_with_sympy_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_int_list_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_issue_140766_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_dynamic_dim_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_mmaped_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_linear_dynamic_maxautotune_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_linear_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_load_package_multiple_gpus_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_masked_select_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misaligned_input_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misaligned_input_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misc_1_max_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misc_1_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_missing_cubin_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_missing_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_model_modified_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_multi_device_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_multiple_output_alias_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_nan_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_narrow_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_nested_tensor_from_jagged_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_no_args_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_contiguous_output_alias_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_default_gpu_device_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_tensor_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_none_args_aot_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_normal_functional_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_on_gpu_device1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_misaligned_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_path_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_path_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_pad_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_poi_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_profile_benchmark_harness_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_abs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_hann_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_permute_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_squeeze_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_pytree_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quanatized_int8_linear_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quantized_linear_bias_none_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quantized_linear_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeat_interleave_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeat_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeated_calling_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_replicate_on_devices_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_return_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_return_view_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_reuse_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_reuse_kernel_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_run_with_grad_enabled_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_complex_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_device_type_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_dtype_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_fp8_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_large_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_shape_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_same_backing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scaled_dot_product_efficient_attention_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scatter_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scatter_reduce_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sdpa_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sdpa_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_seq_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_shifted_constraint_ranges_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_False_max_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_False_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_True_max_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_True_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_multi_arch_embed_kernel_binary_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_split_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_from_multi_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_and_mul_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_expr_transitive_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_small_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_so_without_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_stft_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_stride_with_unbacked_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_subclasses_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sym_i64_input_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_symbool_item_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_symfloat_item_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_symint_item_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sympy_cpp_printer_min_max_minmax0_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sympy_cpp_printer_min_max_minmax1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_torchvision_transforms_functional_tensor_resize_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_autotuning_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_dynamic_launcher_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_dynamic_launcher_grid_infer_from_tensor_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_bool_param_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_dynamic_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_dynamic_shape_with_div_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_float_arg_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_float_arg_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_extern_kernel_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_multi_output_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_reinterpret_view_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_reinterpret_view_mem_leak_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_sympy_expr_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_sympy_fn_like_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_weird_param_order_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_with_none_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_mutated_autotuning_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_next_power_of_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_constant_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_constant_buffer_simple_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_inactive_constant_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_user_managed_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_upper_bound_i64_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_using_model_name_for_files_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_view_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_weight_on_disk_legacy_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_nested_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_simple_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_conv_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_conv_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_mixed_device_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_mixed_device_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_outer_buffers_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_outer_code_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_parameters_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_pytree_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_sym_expr_cond_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_sym_expr_cond_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_unbacked_symint_closure_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_unbacked_symint_closure_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_cudagraphs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_no_triton_profiler_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_offset_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_profiler_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_grid_with_backed_symbols_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_grid_with_unbacked_symbols_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_size_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_size_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__int_mm_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_add_complex_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_addmm_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_addmm_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aliased_buffer_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_amp_fallback_random_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aot_inductor_consts_cpp_build_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_constant_tensor_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_constant_tensor_name_collision_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_cpp_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_fp8_dtype_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_sym_inputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_user_defined_triton_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printing_model_inputs_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_profiler_enable_kernel_profile_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_profiler_enable_kernel_profile_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_runtime_asserts_backed_symint_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_runtime_asserts_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_assert_async_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_assert_tensor_meta_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_autotune_with_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_autotuning_args_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_backward_no_op_logging_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_bmm_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_bool_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_boolean_indexing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_3_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_4_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_and_force_mmap_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_clamp_decomposition_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_composed_dynamic_size_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_mismatched_branch_output_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_mismatched_branch_output_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_nested_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_non_tensor_predicates_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_non_tensor_predicates_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_share_predicte_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_symint_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_unbacked_symint_closure_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_unbacked_symint_closure_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_use_buffers_from_outer_scope_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_multiple_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_outer_code_before_after_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_parameters_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_reinterpret_view_inputs_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_consecutive_compiles_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_folding_with_update_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_original_fqn_and_dtype_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_type_propagation_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_conv3d_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_conv_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_convolution_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_copy_non_blocking_is_pinned_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_d2h_copy_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_deconv_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dup_unbacked_sym_decl_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dup_unbacked_sym_decl_with_refinement_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_duplicate_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_duplicated_params_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_cat_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_scalar_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dynamic_smem_above_default_limit_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_embedding_bag_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_cat_dtype_promotion_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_graph_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_extract_constants_map_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fake_tensor_device_validation_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fallback_kernel_with_symexpr_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fallback_mem_leak_fix_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fft_c2c_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fill__fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_foreach_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fp8_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fp8_view_of_param_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fqn_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_free_inactive_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_fx_gm_return_tuple_validation_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_index_put_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_index_put_with_none_index_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_inf_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_input_codegen_with_sympy_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_int_list_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_issue_140766_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_dynamic_dim_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_mmaped_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_large_weight_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_linear_dynamic_maxautotune_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_linear_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_load_package_multiple_gpus_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_masked_select_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misaligned_input_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misaligned_input_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misc_1_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misc_1_max_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_missing_cubin_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_missing_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_model_modified_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_multi_device_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_multiple_output_alias_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_nan_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_narrow_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_nested_tensor_from_jagged_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_no_args_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_contiguous_output_alias_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_default_gpu_device_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_tensor_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_none_args_aot_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_normal_functional_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_on_gpu_device1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_output_misaligned_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_output_path_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_output_path_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_pad_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_poi_multiple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_profile_benchmark_harness_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_abs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_hann_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_permute_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_squeeze_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_pytree_inputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_quanatized_int8_linear_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_quantized_linear_bias_none_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_quantized_linear_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeat_interleave_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeat_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_calling_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_replicate_on_devices_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_return_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_return_view_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_reuse_kernel_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_reuse_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_run_with_grad_enabled_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_complex_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_device_type_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_dtype_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_fp8_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_large_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_shape_failed_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_same_backing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scaled_dot_product_efficient_attention_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scatter_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scatter_reduce_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sdpa_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sdpa_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_seq_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_shifted_constraint_ranges_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_dynamic_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_False_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_False_max_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_True_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_True_max_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_multi_arch_embed_kernel_binary_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_split_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_from_multi_output_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_and_mul_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_expr_transitive_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_small_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_so_without_weight_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_stft_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_stride_with_unbacked_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_subclasses_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sym_i64_input_codegen_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_symbool_item_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_symfloat_item_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_symint_item_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sympy_cpp_printer_min_max_minmax0_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sympy_cpp_printer_min_max_minmax1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_torchvision_transforms_functional_tensor_resize_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_autotuning_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_dynamic_launcher_grid_infer_from_tensor_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_dynamic_launcher_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_bool_param_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_dynamic_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_dynamic_shape_with_div_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_float_arg_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_float_arg_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_extern_kernel_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_multi_output_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_reinterpret_view_mem_leak_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_reinterpret_view_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_sympy_expr_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_sympy_fn_like_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_weird_param_order_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_with_none_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_mutated_autotuning_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_next_power_of_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_constant_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_constant_buffer_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_inactive_constant_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_user_managed_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_upper_bound_i64_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_using_model_name_for_files_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_view_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_weight_on_disk_legacy_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_nested_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_conv_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_conv_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_mixed_device_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_mixed_device_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_outer_buffers_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_outer_code_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_parameters_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_pytree_inputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_sym_expr_cond_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_sym_expr_cond_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_unbacked_symint_closure_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_unbacked_symint_closure_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_cudagraphs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_no_triton_profiler_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_offset_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_profiler_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_grid_with_backed_symbols_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_grid_with_unbacked_symbols_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_size_buffer_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_zero_size_weight_mps 2025-08-14T21:50:45.3940956Z 2025-08-14T21:50:45.9305797Z Uploading artifacts took 0.60 seconds 2025-08-14T21:50:49.4499113Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:50:49.4500514Z import pkg_resources 2025-08-14T21:50:49.8292152Z Running inductor/test_torchinductor_codegen_dynamic_shapes 1/2 ... [2025-08-14 21:50:49.828800] 2025-08-14T21:50:49.8292876Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:50:49.8295455Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_dynamic_shapes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:50:49.829199] 2025-08-14T21:53:59.6382386Z 2025-08-14T21:53:59.6383318Z test_linalg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_linalg_1.1_f8db1d9fd815e06e_.log 2025-08-14T21:53:59.6762424Z Running 1224 items in this shard: test/test_linalg.py::TestLinalgCUDA::test_1_sized_with_0_strided_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_1_sized_with_0_strided_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_1_k_128_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_1_k_128_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_1_k_64_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_1_k_64_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_32_k_128_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_32_k_128_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_32_k_64_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_matmul_4bit_m_32_k_64_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_256_n_128_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_256_n_32_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_256_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_256_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_64_n_128_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_64_n_32_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__dyn_quant_pack_4bit_weight_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_32_k_32_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_32_k_32_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_32_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_32_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_64_k_32_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_64_k_32_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_64_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test__int4_mm_m_64_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_large_shape_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_48_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_48_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_48_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_48_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_64_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_64_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_64_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_32_n_64_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_48_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_48_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_48_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_48_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_64_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_64_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_64_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_32_k_64_n_64_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_48_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_48_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_48_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_48_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_64_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_64_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_64_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_32_n_64_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_48_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_48_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_48_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_48_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_64_compile_False_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_64_compile_False_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_64_compile_True_slice_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int8_mm_m_64_k_64_n_64_compile_True_slice_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_0_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_17_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_0_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_16_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_16_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_False_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_False_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_0_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_1_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_cpu_m_8_k_32_n_32_use_transpose_a_True_use_transpose_b_True_non_contig_type_2_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_errors_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_16_use_transpose_a_False_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_16_use_transpose_a_False_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_16_use_transpose_a_True_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_16_use_transpose_a_True_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_32_use_transpose_a_False_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_32_use_transpose_a_False_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_32_use_transpose_a_True_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_16_n_32_use_transpose_a_True_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_16_use_transpose_a_False_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_16_use_transpose_a_False_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_16_use_transpose_a_True_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_16_use_transpose_a_True_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_32_use_transpose_a_False_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_32_use_transpose_a_False_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_32_use_transpose_a_True_use_transpose_b_False_cuda, test/test_linalg.py::TestLinalgCUDA::test__int_mm_k_32_n_32_use_transpose_a_True_use_transpose_b_True_cuda, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addbmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_baddbmm_overflow_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_gelu_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_gelu_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_gelu_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_gelu_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_0_2_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_False_alpha_1_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_0_2_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_False_transpose_b_True_alpha_1_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_0_2_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_False_alpha_1_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_0_2_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_5_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_5_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_0_5_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_1_0_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_1_0_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_mv_transpose_a_True_transpose_b_True_alpha_1_0_beta_1_0_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_tunableop_rocm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_tunableop_rocm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_tunableop_rocm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_relu_tunableop_rocm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmm_sizes_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addmm_sizes_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addmm_sizes_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmm_sizes_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmv_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addmv_rowmajor_colmajor_incx_incy_lda_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addmv_rowmajor_colmajor_incx_incy_lda_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addmv_rowmajor_colmajor_incx_incy_lda_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addr_bool_cuda_bool, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_addr_float_and_complex_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_int16, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_int32, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_int64, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_int8, test/test_linalg.py::TestLinalgCUDA::test_addr_integral_cuda_uint8, test/test_linalg.py::TestLinalgCUDA::test_addr_type_promotion_cuda, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_int16, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_int32, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_input_dtypes_compatibility_cuda_int64, test/test_linalg.py::TestLinalgCUDA::test_baddbmm_nan_input_with_zero_beta_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_blas_alpha_beta_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_blas_empty_cuda, test/test_linalg.py::TestLinalgCUDA::test_blas_mv_large_input_cuda, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_blas_nan_out_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_blaslog_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_bmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_bmm_tunableop_rocm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_broadcast_batched_matmul_cuda, test/test_linalg.py::TestLinalgCUDA::test_broadcast_fused_matmul_cuda, test/test_linalg.py::TestLinalgCUDA::test_call_count_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_chain_matmul_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_non_pd_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_non_pd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_non_pd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_ex_non_pd_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_inverse_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_backward_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_many_batches_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_many_batches_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_many_batches_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_batched_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_out_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_out_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_out_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cholesky_solve_out_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_ck_blas_library_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_1_k_128_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_1_k_128_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_1_k_64_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_1_k_64_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_32_k_128_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_32_k_128_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_32_k_64_n_11008_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_dyn_quant_matmul_4bit_m_32_k_64_n_4096_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_32_k_32_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_32_k_32_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_32_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_32_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_64_k_32_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_64_k_32_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_64_k_64_n_48_cuda, test/test_linalg.py::TestLinalgCUDA::test_compile_int4_mm_m_64_k_64_n_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_cond_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cond_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cond_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cond_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cond_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_cond_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cond_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cond_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_corner_cases_of_cublasltmatmul_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_cross_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cross_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_cross_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_cross_with_and_without_dim_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_cross_with_and_without_dim_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_det_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_det_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_det_logdet_slogdet_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_det_logdet_slogdet_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_disable_tuning_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_dot_invalid_args_cuda, test/test_linalg.py::TestLinalgCUDA::test_dot_vs_numpy_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_dot_vs_numpy_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_dump_results_on_exit_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_check_magma_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_compare_backends_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eig_compare_backends_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eig_compare_backends_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_compare_backends_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eig_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eig_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eig_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eig_numpy_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eig_numpy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eig_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_eig_with_nan_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eig_with_nan_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eig_with_nan_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eig_with_nan_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigh_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigh_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigh_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigh_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_lower_uplo_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigh_lower_uplo_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigh_lower_uplo_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_lower_uplo_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_lwork_lapack_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigh_lwork_lapack_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigh_lwork_lapack_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_lwork_lapack_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigh_svd_illcondition_matrix_input_should_not_crash_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigh_svd_illcondition_matrix_input_should_not_crash_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_compare_backends_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvals_compare_backends_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_compare_backends_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigvals_compare_backends_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvals_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigvals_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvals_numpy_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvals_numpy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_eigvalsh_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_einsum_corner_cases_cuda, test/test_linalg.py::TestLinalgCUDA::test_einsum_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_einsum_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_einsum_error_cases_cuda, test/test_linalg.py::TestLinalgCUDA::test_einsum_random_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_einsum_random_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_einsum_sublist_format_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_einsum_sublist_format_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_32_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_35_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_36_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_40_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_32_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_35_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_36_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_40_cuda, test/test_linalg.py::TestLinalgCUDA::test_fp16_mv_transposed_first_argument_arm_cpu_m_64_k_64_cuda, test/test_linalg.py::TestLinalgCUDA::test_gemm_bias_offline_tunableop_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_gemm_bias_tunableop_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_geqrf_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_geqrf_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_geqrf_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_geqrf_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_householder_product_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_householder_product_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_householder_product_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_householder_product_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_householder_product_errors_and_warnings_cuda, test/test_linalg.py::TestLinalgCUDA::test_inner_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inner_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inv_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inv_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inv_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inv_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_info_device_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_info_device_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_info_device_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_info_device_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_singular_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_singular_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_singular_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inv_ex_singular_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_invariance_error_spectral_decompositions_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inverse_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inverse_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_large_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_large_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_large_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inverse_errors_large_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_inverse_many_batches_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_inverse_many_batches_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_inverse_many_batches_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_inverse_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_kron_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_kron_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_kron_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_kron_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_kron_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_kron_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_kron_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_kron_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_kron_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_kron_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_kron_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_kron_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lapack_empty_cuda, test/test_linalg.py::TestLinalgCUDA::test_large_bmm_backward_cuda, test/test_linalg.py::TestLinalgCUDA::test_large_bmm_mm_backward_cuda, test/test_linalg.py::TestLinalgCUDA::test_ldl_factor_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_ldl_factor_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_ldl_factor_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ldl_factor_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_ldl_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_ldl_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_ldl_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ldl_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_cross_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_cross_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_cross_with_and_without_dim_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_cross_with_and_without_dim_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_batch_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_batch_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_batch_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_batch_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_input_checks_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_input_checks_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_input_checks_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lstsq_input_checks_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_cpu_errors_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_cpu_errors_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_cpu_errors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_cpu_errors_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_family_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_family_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_family_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_family_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_lu_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_analytic_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_analytic_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_analytic_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_analytic_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_batch_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_batch_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_boundary_cases_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_boundary_cases_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_boundary_cases_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_boundary_cases_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_compare_with_taylor_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_compare_with_taylor_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_compare_with_taylor_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_compare_with_taylor_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_no_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_perverse_nan_values_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_perverse_nan_values_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_perverse_nan_values_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_perverse_nan_values_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_utils_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_matrix_exp_utils_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_qr_autograd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_large_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_large_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_large_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_linalg_solve_triangular_large_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_linear_algebra_scalar_raises_cuda, test/test_linalg.py::TestLinalgCUDA::test_lobpcg_basic_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lobpcg_ortho_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lobpcg_scipy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lobpcg_torchscript_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lower_precision_accumulation_with_ref_path_cuda, test/test_linalg.py::TestLinalgCUDA::test_lstsq_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_many_batches_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_many_batches_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_many_batches_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_batched_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_large_matrices_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_large_matrices_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_large_matrices_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_lu_solve_large_matrices_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_lu_unpack_check_input_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matmul_45724_cuda, test/test_linalg.py::TestLinalgCUDA::test_matmul_check_entries_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_matmul_mv_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_matmul_mv_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_matmul_mv_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_offline_mgpu_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_offline_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_matmul_out_kernel_errors_with_autograd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matmul_out_kernel_errors_with_autograd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_1d_Nd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_1d_Nd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_2d_Nd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_2d_Nd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_3d_Nd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_3d_Nd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matmul_small_brute_force_tunableop_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_norm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_norm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_power_negative_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_power_negative_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_power_non_negative_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_power_non_negative_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_atol_rtol_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_basic_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_basic_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_basic_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_basic_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_out_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_out_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_out_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_out_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_matrix_rank_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_minimum_tuning_iteration_tunableop_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_mm_bmm_non_memory_dense_cuda, test/test_linalg.py::TestLinalgCUDA::test_mm_conjtranspose_cuda, test/test_linalg.py::TestLinalgCUDA::test_mm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_mm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_mm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_mm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_mm_empty_inputs_mixed_dtype_errors_cuda, test/test_linalg.py::TestLinalgCUDA::test_mm_submatrix_offline_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_multi_dot_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_multi_dot_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_multi_dot_errors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_bfloat16_and_half_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_norm_bfloat16_and_half_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_norm_complex_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_complex_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_complex_old_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_complexhalf_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_dtype_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_errors_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_errors_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_extreme_values_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_fastpaths_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_fro_2_equivalence_old_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_fused_type_promotion_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_norm_fused_type_promotion_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_matrix_degenerate_shapes_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_old_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_old_nan_propagation_cuda, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_degenerate_shapes_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_degenerate_shapes_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_degenerate_shapes_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_norm_vector_degenerate_shapes_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_nuclear_norm_axes_small_brute_force_old_cuda, test/test_linalg.py::TestLinalgCUDA::test_nuclear_norm_exceptions_old_cuda, test/test_linalg.py::TestLinalgCUDA::test_nuclear_norm_out_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_nuclear_norm_out_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_numeric_check_leak_tunableop_rocm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_upper_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_upper_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_upper_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_batched_upper_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_old_cholesky_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_ormqr_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_ormqr_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_ormqr_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ormqr_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_ormqr_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_ormqr_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_ormqr_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_ormqr_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_cuda_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_ger_addr_legacy_tests_cuda, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bfloat16_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_bool_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex128_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_complex64_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float16_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float32_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_float64_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int16_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int32_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int64_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_int8_uint8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_bool, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_complex128, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_complex64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_float16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_float32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_float64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_int16, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_int32, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_int64, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_int8, test/test_linalg.py::TestLinalgCUDA::test_outer_type_promotion_cuda_uint8_uint8, test/test_linalg.py::TestLinalgCUDA::test_pca_lowrank_cuda, test/test_linalg.py::TestLinalgCUDA::test_permute_matmul_cuda, test/test_linalg.py::TestLinalgCUDA::test_pinv_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_pinv_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_pinv_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_pinv_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_pinv_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_pinv_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_pinv_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_pinv_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_pinverse_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_pinverse_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_pinverse_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_pinverse_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_preferred_blas_library_cuda, test/test_linalg.py::TestLinalgCUDA::test_preferred_linalg_library_cuda, test/test_linalg.py::TestLinalgCUDA::test_qr_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_qr_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_qr_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_qr_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_qr_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_qr_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_qr_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_qr_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_qr_error_cases_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_qr_vs_numpy_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_qr_vs_numpy_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_qr_vs_numpy_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_qr_vs_numpy_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_renorm_cuda, test/test_linalg.py::TestLinalgCUDA::test_renorm_ps_cuda, test/test_linalg.py::TestLinalgCUDA::test_rotating_buffer_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_rowwise_scaled_gemm_numerics_tunableop_cuda_float8_e4m3fnuz, test/test_linalg.py::TestLinalgCUDA::test_scaled_gemm_offline_tunableop_cuda_float8_e4m3fnuz, test/test_linalg.py::TestLinalgCUDA::test_scaled_gemm_offline_tunableop_cuda_float8_e5m2fnuz, test/test_linalg.py::TestLinalgCUDA::test_scaled_gemm_tunableop_cuda_float8_e4m3fnuz, test/test_linalg.py::TestLinalgCUDA::test_scaled_gemm_tunableop_cuda_float8_e5m2fnuz, test/test_linalg.py::TestLinalgCUDA::test_slogdet_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_slogdet_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_slogdet_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_slogdet_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_slogdet_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_slogdet_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_slogdet_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_slogdet_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_solve_batched_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_solve_batched_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_solve_batched_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_solve_batched_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_solve_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_strided_mm_bmm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_strided_mm_bmm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_svd_lowrank_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_svd_lowrank_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_symeig_removed_error_cuda, test/test_linalg.py::TestLinalgCUDA::test_tensordot_cuda, test/test_linalg.py::TestLinalgCUDA::test_tensordot_out_kernel_errors_with_autograd_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensordot_out_kernel_errors_with_autograd_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_singular_input_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_singular_input_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_singular_input_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorinv_singular_input_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_empty_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_empty_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_empty_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_empty_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_tensorsolve_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tf32_offline_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_tf32_tunableop_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_broadcasting_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_broadcasting_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_broadcasting_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_broadcasting_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_many_batches_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_many_batches_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_many_batches_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_batched_many_batches_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_large_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_out_errors_and_warnings_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_out_errors_and_warnings_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_out_errors_and_warnings_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_triangular_solve_out_errors_and_warnings_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_validator_tunableop_rocm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_vdot_invalid_args_cuda, test/test_linalg.py::TestLinalgCUDA::test_vdot_vs_numpy_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_vdot_vs_numpy_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_bfloat16, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_float16, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_cuda_float64, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_decom_unbacked_checks_cuda, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_dim_tuple_arg_cuda, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_extreme_values_cuda, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_reduce_over_1D_vector_cuda_complex128, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_reduce_over_1D_vector_cuda_complex64, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_reduce_over_1D_vector_cuda_float32, test/test_linalg.py::TestLinalgCUDA::test_vector_norm_reduce_over_1D_vector_cuda_float64 2025-08-14T21:53:59.7126192Z 2025-08-14T21:54:03.7254122Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:54:03.7255494Z import pkg_resources 2025-08-14T21:54:04.1076100Z Running inductor/test_torchinductor_codegen_dynamic_shapes 2/2 ... [2025-08-14 21:54:04.107128] 2025-08-14T21:54:04.1076644Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:54:04.1081689Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_dynamic_shapes.py', '-m', 'not serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:54:04.107514] 2025-08-14T21:59:09.8744211Z 2025-08-14T21:59:09.8745305Z inductor/test_torchinductor_codegen_dynamic_shapes 2/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_2.2_c50a59be0fe94b5e_.log 2025-08-14T21:59:09.9176415Z Running 846 items in this shard: test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_AllenaiLongformerBase_repro_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test__dyn_quant_matmul_4bit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test__dyn_quant_pack_4bit_weight_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test__unsafe_masked_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test__unsafe_masked_index_put_accumulate_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool2d_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool_errors_with_long_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_max_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_const_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_inplace_permuted_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adding_tensor_offsets_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_addmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_alexnet_prefix_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_angle_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_any_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_cache_hit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_dtype_device_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_override_registration_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_support_str_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_with_persistent_cache_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin_with_nan_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_to_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_as_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_batch_norm_2d_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bitwise2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bitwise3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bitwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bmm1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_add_autotune_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_nd_tiling_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_buffer_batch_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_buffer_use_after_remove_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_float_ndigits_neg_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_float_ndigits_pos_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_extern_kernel_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_single_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_unbacked_legacy_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cauchy_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_check_stack_no_cycles_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_clamp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_clamp_type_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_clone_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_complex_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_computed_buffer_inlining_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_concat_add_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_config_option_dont_assume_alignment_cudagraphs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_consecutive_split_cumsum_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_1d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_2d_strides_nonpositive_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv2d_backward_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv3d_channels_last_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_inference_heuristics_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_shape_check_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_with_as_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_convolution1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_convolution2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_convolution3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_convolution5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_copy_non_blocking_is_pinned_use_cat_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cummin_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumprod_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_inf_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_fixed_layout_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_scan_op_compiled_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_scan_op_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_scan_op_multi_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dense_mask_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_deterministic_codegen_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_deterministic_codegen_on_graph_break_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dist_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_precision_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_presicion_accuracy_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_prim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_softmax_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dont_constant_fold_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout_deterministic_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout_trivial_0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtype_sympy_expr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_fusion_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_elu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_embedding_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_embedding_sparse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_empty1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_empty2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_erfc_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_erfinv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_expand_as_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_expanded_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_expm1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_no_mutated_tensors_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_with_return_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fft_real_input_real_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fill1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fill2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_flip_cat_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_float16_to_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_float_index_expression_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_float_index_expression_type_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_float_repr_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_floordiv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fmod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fmod_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_forced_buffer_realize_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_boolean_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_like_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_truncation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fuse_large_params_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fusing_write_into_disjoint_read_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_generate_rand_fp8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_generated_code_has_size_stride_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_getitem_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_glu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_arange1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_arange2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_argmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_both_scalars_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_constant_tensor1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_misaligned_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_pad_dynamic_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_refcount_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_scalar_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_hardsigmoid_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_hardswish_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_device_assert_masked_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_floordiv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_nested_indirect_indexing_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_remainder_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_deterministic_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_fallback1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_fallback2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_reinplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_remainder_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inductor_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inductor_layout_optimization_input_mutations_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inductor_triton_bucketize_respects_masking_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inf_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inplace_activations_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inplace_add_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inplace_mixed_dtype_ops_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inplace_resize_as_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inplace_where_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_insignificant_strides_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_int_input_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_isinf_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_issue102546_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_kernel_names_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_kwargs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_broadcast_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_grid_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_offset_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_strided_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_tensor_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_layer_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_leaky_relu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_rands2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_rands3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear_mixed_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_list_clearing_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_log2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_log_fp64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_log_softmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logcumsumexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_low_memory_max_pool_dilation_1_dim_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_low_memory_max_pool_dilation_2_dim_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_low_memory_max_pool_dilation_2_dim_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d6_dilation_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d6_dilation_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_min_max_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_min_max_reduction_nan_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mixed_mm2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mixed_mm3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mm_mixed_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mm_views_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mul_softmax_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_any_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_sum_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mutable_custom_op_fixed_layout2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mutations_loop_fusion_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_sort_stable_True_descending_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_sort_stable_True_descending_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_needs_contiguous_strides_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_neg_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_neg_max_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_new_empty_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_new_ones_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nll_loss_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nll_loss_forward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_norm_constant_overflow_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_one_hot_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pad_cast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pad_single_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pad_view_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_bessel_j0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_bessel_y0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_chebyshev_polynomial_u_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_chebyshev_polynomial_w_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_erf_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_erfc_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_exp2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_expm1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_gammaincc_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_hermite_polynomial_h_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_i0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_i0e_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_i1e_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_logit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_i0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_multigammaln_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_ndtr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_polygamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_v_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_sinc_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_spherical_bessel_j0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_xlogy_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_zeta_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_prepare_softmax_with_fast_math_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_rand_like_deterministic_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_distribution_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randn_with_dtype_and_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reduction1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reduction2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reduction3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reduction4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reduction5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reduction_config_limit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_relu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remainder_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_no_ops_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_slice1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_slice_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_view_default_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_as_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_Tensor_decomp_int32_nd_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_resize_as_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_resize_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_roi_align_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_round_correctness_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_rsqrt_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scalar_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scaled_dot_product_efficient_attention_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_add3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_bf16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_reduce3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scheduler_vertical_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_unaligned_mask_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_setitem_with_int_parameter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sgn_extremal_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_should_pad_bench_for_bmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sigmoid_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sign_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sin_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_single_elem_indirect_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_size_asserts_for_multi_output_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_mutation1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_softmax_backward_data_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_softmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_softmax_one_kernel_loop_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_softmax_one_kernel_persist_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_stable_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumsum_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_squeeze1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_squeeze2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_squeeze_varargs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tan_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tanh_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor_index_put_slice_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tmp_not_defined_issue3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_to_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_to_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_transpose_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_transposed_propagates_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_triu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_uint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unbacked_floordiv_simplify_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unbacked_floordiv_simplify_errors_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unbind_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unsqueeze_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unsqueeze_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_bicubic2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_nearest1d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_nearest2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_var_correction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_var_mean_tile_reduction_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_var_mean_tile_reduction_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vectorized_ops_masked_var_novec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_as_complex_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_detach_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_weight_norm_bwd_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_where_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_where_with_logical_op_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_xblock_divides_xnumel_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_AllenaiLongformerBase_repro_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test__dyn_quant_pack_4bit_weight_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test__unsafe_masked_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test__unsafe_masked_index_put_accumulate_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool_errors_with_long_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool_with_output_size_0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_max_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_pool_errors_with_long_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_inplace_permuted_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_alexnet_prefix_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_any_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_cache_hit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_dtype_device_layout_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_support_out_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_arange1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_arange2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_arange3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_arange4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_arange6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool3d_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool3d_backward3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bfloat16_to_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bitwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bmm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bmm2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_default_kwargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_nd_tiling_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_buffer_batch_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_buffer_use_after_remove_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_float_ndigits_pos_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_float_ndigits_zero_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_int_ndigits_zero_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_extern_kernel_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_negative_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_single_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_unbacked_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_unbacked_legacy_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cauchy_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_check_stack_no_cycles_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_chunk_recompiles_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clamp_type_promotion_non_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_complex_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_computed_buffer_inlining_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_concat_add_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_config_option_dont_assume_alignment_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_config_option_dont_assume_alignment_recompiles_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_consecutive_split_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_nd_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv2d_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv3d_channels_last_use_block_ptr_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_bn_fuse_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_inference_heuristics_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_shape_check_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_with_as_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_copy_non_blocking_is_pinned_use_cat_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_copy_non_blocking_is_pinned_use_cat_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cummin_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumprod_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_fixed_layout_sequential_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_scan_op_compiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_scan_op_multi_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_scan_would_split_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dense_mask_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_deterministic_codegen_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_deterministic_codegen_on_graph_break_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_deterministic_codegen_with_suffix_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_diagonal_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dist_bf16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dist_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div7_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_by_zero_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_presicion_accuracy_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_prim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout_trivial_0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout_trivial_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtype_mismatch_issue_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_fusion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_elu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_embedding_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_embedding_sparse_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_empty1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_empty2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_empty_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_erfinv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_exp2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_list_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_no_mutated_tensors_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_with_return_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fft_real_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fft_real_input_real_output_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fill1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float_index_expression_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float_index_expression_type_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_floordiv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fmin_fmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fmod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fuse_large_params_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fusing_write_into_disjoint_read_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gather2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gather3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gelu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_generate_rand_fp8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_generated_code_has_alignment_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_getitem_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_argmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_misaligned_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_no_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_refcount_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_unbacked_symint_as_output_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_grid_sampler_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_hardsigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_hardswish_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_horizonal_fusion2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_device_assert_masked_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_flip_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_floordiv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_deterministic_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_fallback1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_fallback2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_select_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inductor_multiple_specializations_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inductor_triton_bucketize_respects_masking_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inner_fn_str_and_stride_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_add_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_mixed_dtype_ops_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_int_input_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_invalid_operand_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_isinf2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_issue102546_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_kernel_names_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_l1_loss_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_broadcast_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_grid_use_block_ptr_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_grid_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_strided_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_tensor_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_lerp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_lgamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_rands2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_rands_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linear1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linear2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linear_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linear_mixed_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_list_clearing_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logaddexp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_low_memory_max_pool_dilation_1_dim_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_low_memory_max_pool_dilation_1_dim_3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_low_memory_max_pool_dilation_2_dim_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_low_memory_max_pool_dilation_2_dim_3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_masked_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_matmul_layer_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_min_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d6_dilation_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d6_dilation_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mean_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mixed_mm3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mm_mixed_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mul_index_expr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multi_gpu_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multi_threading_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_var_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mutable_custom_op_fixed_layout2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_sort_stable_False_descending_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_sort_stable_True_descending_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_needs_contiguous_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_new_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_new_empty_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nll_loss_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nll_loss_forward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_no_specization_over_symbolic_value_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pattern_matcher_multi_user_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_permute1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pixel_shuffle_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_bessel_j0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_bessel_j1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_bessel_y0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_chebyshev_polynomial_t_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_chebyshev_polynomial_w_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_erfc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_erfinv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_expit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_expm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_gammaincc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_gammaln_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_hermite_polynomial_h_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_i0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_i1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_i1e_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_log1p_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_log_ndtr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_logit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_k0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_multigammaln_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_psi_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_scaled_modified_bessel_k0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_shifted_chebyshev_polynomial_t_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_shifted_chebyshev_polynomial_v_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_shifted_chebyshev_polynomial_w_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_xlog1py_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_zeta_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_polar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pow1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pow2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pow3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pow_by_natural_log2_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pow_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pow_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_prepare_softmax_with_fast_math_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_rand_like_deterministic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randn_generator_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randn_like_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randn_with_dtype_and_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reflection_pad2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_relu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_no_ops_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_view_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_as_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_Tensor_decomp_int32_nd_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_require_stride_expanded_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_resize_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_roll_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_rsqrt_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scalar_cpu_tensor_arg_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scalar_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scalar_output_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scaled_dot_product_efficient_attention_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_add2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_add3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_reduce1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scheduler_vertical_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_unaligned_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_searchsorted_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_select_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sgn_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sgn_extremal_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_shape_prop_torch_ones_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_silu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sin_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_single_elem_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_single_elem_indirect_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_size_asserts_for_multi_output_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sizehint_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_mutation2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_mutation3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_softmax_backward_data_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_softmax_one_kernel_persist_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sort_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sort_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sort_stable_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_cumprod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_cumprod_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_cumsum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_failed_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_integer_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sqrt_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_squeeze1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_squeeze2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tanh_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor_index_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tmp_not_defined_issue1_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tmp_not_defined_issue2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tmp_not_defined_issue3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_transpose_add_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_transpose_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_triton_kernel_bool_param_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_triu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_uint4x2_mixed_mm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unbacked_floordiv_simplify_errors_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unfold_zero_dimension_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unroll_small_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unsqueeze_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unsqueeze_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_bilinear2d_a_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_var_correction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_var_mean_tile_reduction_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vectorized_ops_masked_var_novec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_view_as_complex_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_view_as_real_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_view_detach_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_view_on_aliased_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_view_uint8_through_differing_bitwidths_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views7_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_weight_norm_bwd_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_where_with_logical_op_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_zero_dim_reductions_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_zeros_dynamic_shapes_cuda 2025-08-14T21:59:09.9594213Z 2025-08-14T21:59:13.8639005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T21:59:13.8640641Z import pkg_resources 2025-08-14T21:59:14.2356189Z Running inductor/test_torchinductor_opinfo 2/11 ... [2025-08-14 21:59:14.235191] 2025-08-14T21:59:14.2356729Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T21:59:14.2359415Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=2', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 21:59:14.235594] 2025-08-14T22:00:39.6928918Z 2025-08-14T22:00:39.6932557Z inductor/test_torchinductor_codegen_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.2_6e921a24d544b630_.log 2025-08-14T22:00:39.7367661Z Running 842 items in this shard: test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_abs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool_with_output_size_0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_pool_errors_with_long_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_const_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_addmv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aliased_buffer_reuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_support_out_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_with_scalar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_argmax_min_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_as_strided_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_alignment_op_name_fail_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_alignment_op_name_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_size_stride_op_name_fail_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_size_stride_op_name_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool_errors_with_uint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_baddbmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_batch_norm_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bernoulli1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bernoulli2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bfloat16_to_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bmm2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_both_scalars_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_computed_offsets_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_default_kwargs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_nd_tiling_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_buffer_copied_in_graph_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_float_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_int_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_empty_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_negative_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_unbacked_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_unbacked_empty_1d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_upcasting_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_chunk_recompiles_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_clamp_type_promotion_non_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_compar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_complex_memory_overlap_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_config_option_dont_assume_alignment_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_config_option_dont_assume_alignment_recompiles_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_consecutive_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_const_int32_to_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_fill_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_nd_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv2d_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv3d_channels_last_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_functional_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_convolution4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_copy_non_blocking_is_pinned_use_cat_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cos_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cudnn_rnn_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_no_mask_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_default_layout_constraint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_fixed_layout_sequential_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_scan_would_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_data_type_propogation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_deterministic_codegen_with_suffix_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_device_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_diagonal_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dist_bf16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div9_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_by_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dropout_trivial_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtype_mismatch_issue_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_embedding_bag_byte_unpack_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_embedding_bag_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_empty_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exact_stride_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exp2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_expand_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_basic_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_list_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_list_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fft_real_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_float32_to_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fmin_fmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fractional_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_like_sliced_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_like_transposed_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_functionalize_rng_wrappers_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fuse_tiled_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gelu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_generated_code_has_alignment_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_constant_tensor2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_mutation_real_name_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_no_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_graph_partition_unbacked_symint_as_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_grid_sampler_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_hardtanh_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_horizonal_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_horizonal_fusion2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_abs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_as_masked_fill_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_failed_reinplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_select_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_indirect_load_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inductor_multiple_specializations_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inner_fn_str_and_stride_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_input_mutation5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_int8_weight_only_quant_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_invalid_operand_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_isin_tensor_scalar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_isinf2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_l1_loss_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_grid_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_large_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_lerp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_lgamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_rands_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_rands_sliced_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linear_dynamic_maxautotune_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logaddexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logcumsumexp_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logsumexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_long_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_low_memory_max_pool_dilation_1_dim_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_fill_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_fill_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_matmul_layer_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_min_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mean_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_misaligned_address_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mix_device_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mixed_mm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_move_arange_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mul_index_expr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_gpu_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_threading_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_prime_size_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_var_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multilayer_var_lowp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_sort_stable_False_descending_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_to_num_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_narrow_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_new_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_mega_fusion_during_lowering_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_op_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_specization_over_symbolic_value_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nonzero_unbacked_refinement_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_output_strides_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pattern_matcher_multi_user_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_permute1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_permute2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_philox_rand_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pixel_shuffle_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_airy_ai_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_bessel_j1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_bessel_y1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_chebyshev_polynomial_t_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_digamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_entr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_erfcx_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_erfinv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_expit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_gammainc_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_gammaln_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_log_ndtr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_k0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_ndtri_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_psi_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_scaled_modified_bessel_k0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_t_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_w_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_xlog1py_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_polar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow_by_natural_log2_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_prod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_profiler_mark_wrapper_call_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_int64_mod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_kernel_count_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randn_generator_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randn_like_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reflection_pad2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reflection_pad2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reinterpret_dtypeview_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_clone_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_slice_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_view_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_replication_pad_errors_with_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_require_stride_expanded_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_roll_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_rsqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scalar_cpu_tensor_arg_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scalar_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scaled_dot_product_attention_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_add1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_add2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_reduce1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_reduce2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_searchsorted_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_searchsorted_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_select_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sgn_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_shape_padding_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_shape_prop_torch_ones_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_signbit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_silu_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_simplify_loops_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_single_elem_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sizehint_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_mutation3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter_reinplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_view_with_graph_break_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_transpose_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_special_polygamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumprod_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumsum_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumsum_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_failed_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_reduction_dynamic_shape_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_reduction_with_int64_size_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_integer_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_list_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_sizes_with_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_stack_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_std_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_stride_preservation_with_stride_modifying_fx_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_strided_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum_keepdims_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor_index_slice_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tmp_not_defined_issue1_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tmp_not_defined_issue2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_to_device_constant_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_to_memory_format_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_topk_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_transpose_add_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_triton_kernel_bool_param_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_uint4x2_mixed_mm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unfold_zero_dimension_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unroll_small_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unspec_inputs_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_bilinear2d_a_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_bilinear2d_b_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_cat_conv_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_nearest2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_nearest3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_var_mean_div_by_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vdd_clamp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vectorized_ops_masked_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vertical_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_as_real_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_on_aliased_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_uint8_through_differing_bitwidths_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zero_dim_reductions_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zero_element_mutation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zeros_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test__dyn_quant_matmul_4bit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_abs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_max_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_complex6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_const_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_const_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adding_tensor_offsets_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_addmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_addmv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aliased_buffer_reuse_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_angle_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_override_registration_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_support_str_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_with_persistent_cache_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_with_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_arange5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin_with_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_min_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_to_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_as_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_as_strided_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_alignment_op_name_fail_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_alignment_op_name_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_size_stride_op_name_fail_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_size_stride_op_name_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d_backward3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool3d_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool3d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool_errors_with_uint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_baddbmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_batch_norm_2d_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_batch_norm_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bernoulli1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bernoulli2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bitwise2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bitwise3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_add_autotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_computed_offsets_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int8_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_nd_tiling_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_buffer_copied_in_graph_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_float_ndigits_neg_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_empty_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_unbacked_empty_1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_upcasting_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clamp_type_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clone_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_compar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_complex_memory_overlap_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_config_option_dont_assume_alignment_cudagraphs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_consecutive_split_cumprod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_const_int32_to_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_2d_strides_nonpositive_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_fill_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv2d_backward_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv3d_channels_last_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_functional_bn_fuse_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cos_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cudnn_rnn_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_inf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_no_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_default_layout_constraint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_fixed_layout_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_scan_op_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_data_type_propogation_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_device_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div9_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_precision_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_softmax_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dont_constant_fold_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout_deterministic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtype_sympy_expr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_embedding_bag_byte_unpack_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_embedding_bag_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_erfc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_exact_stride_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_exp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expand_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expand_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expanded_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_basic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_list_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fill2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_flip_cat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_flip_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float16_to_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float32_to_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float_repr_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fmod_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_forced_buffer_realize_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_boolean_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_sliced_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_transposed_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_truncation_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_functionalize_rng_wrappers_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fuse_tiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gather1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gather_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_generated_code_has_size_stride_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_glu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_arange1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_arange2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_constant_tensor1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_constant_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_mutation_real_name_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_pad_dynamic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_scalar_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_hardtanh_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_horizonal_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_abs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_nested_indirect_indexing_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_as_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_failed_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_indirect_load_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inductor_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inductor_layout_optimization_input_mutations_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_activations_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_where_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_insignificant_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_int8_weight_only_quant_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_isin_tensor_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_isinf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_kwargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_offset_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_layer_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_leaky_relu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_rands3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_rands_sliced_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linear_dynamic_maxautotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log1p_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log_fp64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logcumsumexp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logcumsumexp_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logsumexp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_long_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_masked_fill_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_min_max_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_min_max_reduction_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_misaligned_address_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mix_device_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mixed_mm2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mixed_mm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mm_views_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_move_arange_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mul_softmax_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multi_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_any_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_prime_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_sum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_var_lowp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mutations_loop_fusion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_sort_stable_True_descending_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_to_num_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_narrow_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_neg_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_neg_max_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_new_ones_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_no_mega_fusion_during_lowering_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_no_op_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nonzero_unbacked_refinement_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_norm_constant_overflow_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_one_hot_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_output_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_cast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_single_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_view_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_permute2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_philox_rand_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_airy_ai_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_bessel_y1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_digamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_entr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_erf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_erfcx_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_exp2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_gammainc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_i0e_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_i0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_ndtr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_ndtri_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_sinc_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_spherical_bessel_j0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_xlogy_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_prod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_profiler_mark_wrapper_call_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_distribution_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_int64_mod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_kernel_count_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction_config_limit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reflection_pad2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reinterpret_dtypeview_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_clone_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_view_default_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_replication_pad_errors_with_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_roi_align_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_round_correctness_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_rsqrt_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scaled_dot_product_attention_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_add1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_bf16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_reduce2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_reduce3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_searchsorted_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_setitem_with_int_parameter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_shape_padding_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_should_pad_bench_for_bmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sign_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_signbit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_simplify_loops_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_view_with_graph_break_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_softmax_one_kernel_loop_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sort_transpose_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_special_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_cumsum_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_reduction_dynamic_shape_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_reduction_with_int64_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_list_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_sizes_with_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_squeeze_varargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_stack_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_std_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_stride_preservation_with_stride_modifying_fx_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_strided_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_keepdims_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor_index_put_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_device_constant_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_memory_format_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_topk_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_transposed_propagates_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_uint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unbacked_floordiv_simplify_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unbind_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_bicubic2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_bilinear2d_b_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_cat_conv_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_var_mean_div_by_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_var_mean_tile_reduction_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vdd_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vectorized_ops_masked_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vertical_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_where_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_xblock_divides_xnumel_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_zero_element_mutation_dynamic_shapes_cuda 2025-08-14T22:00:39.7787134Z 2025-08-14T22:00:43.8667093Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:00:43.8668776Z import pkg_resources 2025-08-14T22:00:44.2454789Z Running inductor/test_torchinductor_opinfo 5/11 ... [2025-08-14 22:00:44.244924] 2025-08-14T22:00:44.2455280Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:00:44.2456706Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=5', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:00:44.245329] 2025-08-14T22:02:06.7599837Z 2025-08-14T22:02:06.7603614Z inductor/test_torchinductor_opinfo 5/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_5.11_5f480f72202a5622_.log 2025-08-14T22:02:06.7739829Z Running 322 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rxor___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__segment_reduce_offsets_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__softmax_backward_data_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcmul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcmul_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_allclose_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bernoulli_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bincount_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_left_shift_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cauchy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chalf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumulative_trapezoid_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fill_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flatten_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gcd_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gradient_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_grid_sampler_2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hstack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isclose_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lcm_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorinv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorsolve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logcumsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logaddexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_std_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_var_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanquantile_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_group_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardswish_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_kl_div_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softshrink_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_inf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pca_lowrank_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pinverse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polar_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_like_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_real_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hann_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_sampled_addmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_xlog1py_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_unbiased_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensordot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__flash_attention_forward_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_indices_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vdot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_float16 2025-08-14T22:02:06.7874421Z 2025-08-14T22:02:10.8739817Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:02:10.8741217Z import pkg_resources 2025-08-14T22:02:11.2544822Z Running inductor/test_torchinductor_opinfo 6/11 ... [2025-08-14 22:02:11.253901] 2025-08-14T22:02:11.2545468Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:02:11.2548220Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=6', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:02:11.254342] 2025-08-14T22:03:57.1619421Z 2025-08-14T22:03:57.1620695Z inductor/test_torchinductor_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_dynamic_shapes_1.2_8f848d6bbef141cc_.log 2025-08-14T22:03:57.2000132Z Running 841 items in this shard: test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_AllenaiLongformerBase_repro_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test__dyn_quant_matmul_4bit_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test__unsafe_masked_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool2d_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_avg_pool_with_output_size_0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_adaptive_pool_errors_with_long_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_complex_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_const_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_const_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_add_inplace_permuted_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_addmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_addmv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_alexnet_prefix_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aliased_buffer_reuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_angle_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_any_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_cache_hit_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_dtype_device_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_aoti_eager_with_persistent_cache_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_arange2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_arange5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_arange6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_argmin_with_nan_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_min_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_argmax_to_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool3d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool3d_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool3d_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_avg_pool_errors_with_uint_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_batch_norm_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bernoulli1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bitwise2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bitwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bmm2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int32_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int64_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_int8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_uint8_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_bucketize_int_uint8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_buffer_use_after_remove_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_float_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_int_ndigits_pos_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_builtins_round_int_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_empty_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_negative_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_unbacked_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cat_upcasting_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_chunk_recompiles_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_clamp_type_promotion_non_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_clone_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_compar_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_complex_memory_overlap_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_config_option_dont_assume_alignment_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_consecutive_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_consecutive_split_cumsum_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_1d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_2d_strides_nonpositive_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_fill_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_constant_pad_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv2d_backward_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv3d_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_shape_check_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_conv_with_as_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_convolution5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cudnn_rnn_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_cummin_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_op_default_layout_constraint_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_op_compiled_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_op_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_custom_scan_would_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_data_type_propogation_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dense_mask_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_deterministic_codegen_on_graph_break_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_deterministic_codegen_with_suffix_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_device_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_diagonal_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dist_bf16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dist_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div9_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div_precision_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div_softmax_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_div_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dont_constant_fold_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dropout_deterministic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dropout_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_bfloat16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_bfloat16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_bfloat16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float32_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float64_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_float64_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int16_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int32_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int32_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int64_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int64_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int64_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_uint8_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_uint8_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_uint8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_dtypeview_uint8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_embedding_bag_byte_unpack_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_embedding_bag_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_embedding_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_empty_strided_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_erfinv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_exp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_expanded_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fallback_mutable_op_basic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fallback_mutable_op_list_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fill2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_float32_to_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_float_index_expression_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_float_index_expression_type_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_floordiv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fmin_fmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fmod_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fractional_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fractional_max_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fractional_max_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_full_like_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_full_truncation_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_functionalize_rng_wrappers_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_fusing_write_into_disjoint_read_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gather1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gather2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_gelu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_generated_code_has_alignment_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_argmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_constant_tensor1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_misaligned_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_mutation_real_name_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_no_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_pad_dynamic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_scalar_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_graph_partition_unbacked_symint_as_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_hardsigmoid_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_hardswish_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_horizonal_fusion2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_device_assert_masked_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_propagation_nested_indirect_indexing_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put_deterministic_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_put_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_remainder_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_select_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_index_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inductor_multiple_specializations_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inductor_triton_bucketize_respects_masking_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inplace_activations_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_inplace_resize_as_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_input_mutation4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_int_input_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_invalid_operand_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_issue102546_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_grid_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_large_offset_pointwise_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_layer_norm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_leaky_relu_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_lerp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_lgamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_rands_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_like_rands_sliced_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linear_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linear_mixed_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linspace1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_linspace2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_log_softmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_logaddexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_logcumsumexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_logcumsumexp_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_logsumexp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_low_memory_max_pool_dilation_1_dim_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_low_memory_max_pool_dilation_2_dim_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_low_memory_max_pool_dilation_2_dim_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_masked_fill_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_masked_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_min_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d6_dilation_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_max_pool2d_with_indices_backward6_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_min_max_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_misaligned_address_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mixed_mm2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mixed_mm3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mm_mixed_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mm_views_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mul_softmax_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multi_gpu_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multi_threading_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multilayer_var_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_multilayer_var_lowp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_mutations_loop_fusion_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_sort_stable_False_descending_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_sort_stable_True_descending_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nan_sort_stable_True_descending_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_needs_contiguous_strides_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_neg_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_nll_loss_forward_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pad_single_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pattern_matcher_multi_user_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_permute2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_philox_rand_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_bessel_j0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_bessel_j1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_bessel_y0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_erfc_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_erfcx_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_erfinv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_gammainc_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_hermite_polynomial_h_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_laguerre_polynomial_l_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_log1p_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_log_ndtr_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_logit_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_i0_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_multigammaln_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_ndtri_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_polygamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_shifted_chebyshev_polynomial_u_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_shifted_chebyshev_polynomial_w_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_sinc_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_xlog1py_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pointwise_zeta_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_polar_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pow3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pow_int_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_pow_symfloat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_prepare_softmax_with_fast_math_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_profiler_mark_wrapper_call_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_rand_like_deterministic_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_randn_like_empty_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_randn_with_dtype_and_device_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reduction4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reinterpret_dtypeview_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remainder_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_no_ops_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_noop_clone_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_noop_copy_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_remove_noop_view_default_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_Tensor_decomp_int32_nd_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_Tensor_decomp_int64_nd_2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_repeat_interleave_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_replication_pad_errors_with_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_require_stride_expanded_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_resize_as_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_round_correctness_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scalar_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scalar_output_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scaled_dot_product_attention_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scaled_dot_product_efficient_attention_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_add1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_add2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_reduce1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_reduce2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_scatter_reduce3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_searchsorted_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sgn_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sgn_extremal_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_shape_prop_torch_ones_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_should_pad_bench_for_bmm_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sign_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sin_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_single_elem_indirect_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_size_asserts_for_multi_output_fallback_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_mutation1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_mutation2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_scatter3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_scatter5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_slice_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_softmax_backward_data_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sort_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sort_stable_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_special_polygamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_cumsum_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_cumsum_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_failed_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_with_integer_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_split_with_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_squeeze1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_squeeze2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_squeeze_varargs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_stack_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_std_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_stride_preservation_with_stride_modifying_fx_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_strided_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum4_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_sum_keepdims_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tanh_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tensor2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_tmp_not_defined_issue3_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_to_device_constant_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_topk_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_transpose_add_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_transposed_propagates_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_uint_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unbacked_floordiv_simplify_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unspec_inputs_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unsqueeze_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_unsqueeze_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_upsample_bilinear2d_a_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_upsample_cat_conv_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_upsample_nearest2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_var_mean_tile_reduction_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vdd_clamp_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vectorized_ops_masked_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vectorized_ops_masked_var_novec_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_vertical_fusion1_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_view_on_aliased_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_view_uint8_through_differing_bitwidths_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views2_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views5_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_views7_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_where_broadcast_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_xblock_divides_xnumel_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesCpuTests::test_zero_dim_reductions_dynamic_shapes_cpu, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test__dyn_quant_matmul_4bit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test__dyn_quant_pack_4bit_weight_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test__unsafe_masked_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_avg_pool_errors_with_long_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_adaptive_pool_errors_with_long_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_complex_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_const_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_add_const_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_addmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_angle_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_aoti_eager_support_out_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_aoti_eager_with_persistent_cache_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_arange5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_argmin1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_argmin_with_duplicates_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_argmax_min_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_assert_alignment_op_name_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_assert_size_stride_op_name_fail_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d6_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool2d_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool3d_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool3d_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_avg_pool3d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bfloat16_to_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bitwise2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bitwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bmm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_add_autotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int16_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int32_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_int8_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_uint8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_bucketize_int_uint8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_buffer_copied_in_graph_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_extern_kernel_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_negative_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_unbacked_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_unbacked_empty_1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cat_unbacked_legacy_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cauchy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_chunk_recompiles_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_clamp_type_promotion_non_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_complex_memory_overlap_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_concat_add_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_2d_strides_nonpositive_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_constant_pad_nd_inplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv2d_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv3d_channels_last_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_conv_with_as_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_convolution2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_convolution3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_convolution4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_copy_non_blocking_is_pinned_use_cat_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cos_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cummin_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumprod_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumsum_no_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_cumsum_pattern_matcher_issue_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_fixed_layout_sequential_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_op_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_scan_op_compiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_scan_op_multi_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_custom_scan_would_split_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_data_type_propogation_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dense_mask_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_deterministic_codegen_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_deterministic_codegen_with_suffix_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_device_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_diagonal_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dist_bf16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div_prim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_div_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout_deterministic_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dropout_trivial_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_bfloat16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float32_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float32_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float32_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_float64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_fusion_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int32_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int64_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int64_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int8_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_int8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_uint8_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_uint8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_uint8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_dtypeview_uint8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_embedding_bag_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_embedding_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_empty_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_erfc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_exp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fallback_mutable_op_list_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fallback_mutable_op_no_mutated_tensors_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fallback_mutable_op_with_return_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fft_real_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_float16_to_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_float32_to_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fmod_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fractional_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fractional_max_pool2d4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_full_like_sliced_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_functionalize_rng_wrappers_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fuse_large_params_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fuse_tiled_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_fusing_write_into_disjoint_read_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gather3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gather_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_gelu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_generated_code_has_alignment_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_generated_code_has_size_stride_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_argmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_constant_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_misaligned_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_mutation_real_name_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_graph_partition_pad_dynamic_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_hardsigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_hardtanh_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_horizonal_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_flip_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_propagation_floordiv_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_as_masked_fill_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_deterministic_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_index_put_reinplace_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inductor_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inductor_multiple_specializations_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inductor_triton_bucketize_respects_masking_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_mixed_dtype_ops_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_inplace_where_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_input_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_input_mutation3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_insignificant_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_int8_weight_only_quant_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_int_input_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_invalid_operand_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_isin_tensor_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_isinf2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_kernel_names_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_grid_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_offset_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_large_tensor_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_layer_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_lgamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_like_rands_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linear2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linear_dynamic_maxautotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linspace1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linspace2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_linspace3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_log2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_log_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_logcumsumexp_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_low_memory_max_pool_dilation_1_dim_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_low_memory_max_pool_dilation_2_dim_2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_masked_fill_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_masked_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_matmul_layer_norm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d6_dilation_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_max_pool2d_with_indices_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_min_max_reduction_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mixed_mm3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mixed_mm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mm_views_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mul_softmax_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multi_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multi_threading_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_any_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_sum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_multilayer_var_lowp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mutable_custom_op_fixed_layout_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_mutations_loop_fusion_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_nan_to_num_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_neg_max_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_new_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_new_empty_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pad_cast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_philox_rand_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_airy_ai_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_bessel_j1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_bessel_y0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_chebyshev_polynomial_u_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_expit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_expm1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_gammainc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_i0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_i1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_i1e_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_logit_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_ndtr_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_ndtri_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_psi_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_shifted_chebyshev_polynomial_v_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_sinc_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_spherical_bessel_j0_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pointwise_xlogy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_polar_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_pow_symfloat_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_prepare_softmax_with_fast_math_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randint_distribution_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randint_kernel_count_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randn_generator_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randn_like_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_randn_with_dtype_and_device_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reduction1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reduction2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remove_noop_copy_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remove_noop_slice_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_remove_noop_view_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_repeat_interleave_Tensor_decomp_int32_nd_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_repeat_interleave_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_replication_pad_errors_with_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_resize_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_reuse_buffers_with_aliasing_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_roi_align_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_round_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_rsqrt_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scalar_input_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter6_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter_add1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scatter_reduce3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_scheduler_vertical_fusion1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sdpa_prefer_nd_tiling_True_use_block_ptr_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sdpa_unaligned_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_searchsorted_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_setitem_with_int_parameter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sgn_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sgn_extremal_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_should_pad_bench_for_bmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sign_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_silu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_simplify_loops_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sin_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_single_elem_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_size_asserts_for_multi_output_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sizehint_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_scatter2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_slice_view_with_graph_break_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_softmax_backward_data_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sort_bool_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sort_transpose_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_special_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumprod_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumsum_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumsum_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_cumsum_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_reduction_dynamic_shape_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_split_reduction_with_int64_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_squeeze1_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_squeeze2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_squeeze_varargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_std_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_strided_inputs_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum3_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum4_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_sum_keepdims_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tensor_index_put_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_tmp_not_defined_issue2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_to_memory_format_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_triu_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_uint4x2_mixed_mm_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_uint_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unfold_zero_dimension_tensor_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_int8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unspec_inputs_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_unsqueeze_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_bicubic2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_upsample_nearest2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_var_mean_div_by_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_var_mean_tile_reduction_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_var_mean_tile_reduction_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_vdd_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_vectorized_ops_masked_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_as_real_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_view_detach_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_views2_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::DynamicShapesGPUTests::test_views5_dynamic_shapes_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_adaptive_max_pool3d_with_indices_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_arithmetic_constant_folding_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_bool_mask_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_dynamic_stride_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_float_is_integer_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_float_item_neginf_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_floor_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_full_symbolic_value_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_interpolate_ceil_eq_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_bool_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_materialize_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_item_to_inputs_kernel_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op1_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op2_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op3_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op7_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_math_ops_op8_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_nonzero_no_realloc_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_nonzero_size_factory_nobreak_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_pad_dynamic_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_shape_as_constant_reciprocal_float_exp_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_sort_dynamic_shape_with_check_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_sub_constant_folding_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_sym_sum_unbacked_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_cat_backwards_save_data_dependent_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_matmul_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_reduction_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unbacked_save_for_backwards_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_dynamic_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_fallback_specialization_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_fallback_symint_specialization_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_operations_cuda, test/inductor/test_torchinductor_dynamic_shapes.py::TestInductorDynamicCUDA::test_unspecialized_float_softshrink_cuda 2025-08-14T22:03:57.2402293Z 2025-08-14T22:04:01.3073011Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:04:01.3074366Z import pkg_resources 2025-08-14T22:04:01.6887656Z Running inductor/test_torchinductor_opinfo 8/11 ... [2025-08-14 22:04:01.688177] 2025-08-14T22:04:01.6888223Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:04:01.6889636Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '-m', 'not serial', '--shard-id=8', '--num-shards=11', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:04:01.688550] 2025-08-14T22:04:58.2719796Z 2025-08-14T22:04:58.2720668Z inductor/test_torchinductor_opinfo 2/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_2.11_2e7a133ca1cb6b1c_.log 2025-08-14T22:04:58.2986125Z Running 354 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rxor___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__softmax_backward_data_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addbmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcmul_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_baddbmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_and_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_and_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_right_shift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_right_shift_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdist_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cfloat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_complex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exponential_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfftn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_power_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gradient_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_grid_sampler_2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igamma_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_add_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_mean_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigvals_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_inv_ex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_inv_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_rank_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_slogdet_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_normal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logcumsumexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_unpack_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_normalize_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matrix_exp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_multinomial_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmedian_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_dropout_backward_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_group_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardswish_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_linear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_nuc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_in_place_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polar_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_real_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsub_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_searchsorted_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hamming_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_kaiser_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_nuttall_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_mean_unbiased_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_svd_lowrank_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tile_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_indices_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_complex_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vstack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_float16 2025-08-14T22:04:58.3206674Z 2025-08-14T22:05:02.3927956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:05:02.3930290Z import pkg_resources 2025-08-14T22:05:02.7807705Z Running inductor/test_cpu_repro 1/1 ... [2025-08-14 22:05:02.780162] 2025-08-14T22:05:02.7808504Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:05:02.7810663Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_repro.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:05:02.780564] 2025-08-14T22:05:39.0186543Z 2025-08-14T22:05:39.0187648Z inductor/test_torchinductor_opinfo 8/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.11_958e156557c89031_.log 2025-08-14T22:05:39.0322434Z Running 305 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___getitem___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___ror___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rpow___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__batch_norm_with_update_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acos_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_allclose_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_partial_views_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_1d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_left_shift_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_block_diag_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_constant_pad_nd_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dist_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frac_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_frexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gather_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ge_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hash_tensor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_i0_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_igamma_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_inner_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isfinite_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eig_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_eigvalsh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_multi_dot_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_pinv_singular_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_ex_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_normal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_log_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matmul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matmul_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mode_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nanmean_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_batch_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_neg_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_celu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_ctc_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_glu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_grid_sample_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_huber_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_grad_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_normalize_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_selu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_static_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_in_place_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ormqr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_quantile_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_neg_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_mean_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sgn_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_exponential_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_general_hamming_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hamming_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_log_ndtr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtri_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_zeta_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unbind_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_uint32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_chunk_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_split_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_mean_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float16 2025-08-14T22:05:39.0449653Z 2025-08-14T22:05:43.1585249Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:05:43.1586561Z import pkg_resources 2025-08-14T22:05:43.5498747Z Running inductor/test_cuda_repro 1/1 ... [2025-08-14 22:05:43.549420] 2025-08-14T22:05:43.5499189Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:05:43.5502774Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_repro.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:05:43.549850] 2025-08-14T22:05:51.3793956Z 2025-08-14T22:05:51.3794785Z inductor/test_cuda_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_repro_1.1_dd7de682da51f01b_.log 2025-08-14T22:05:51.3817021Z Running 78 items in this shard: test/inductor/test_cuda_repro.py::CudaReproTests::test_3d_tiling, test/inductor/test_cuda_repro.py::CudaReproTests::test_accuracy_issue1, test/inductor/test_cuda_repro.py::CudaReproTests::test_adaptive_avg_pool3d_issue_157248, test/inductor/test_cuda_repro.py::CudaReproTests::test_atomic_add_bfloat16, test/inductor/test_cuda_repro.py::CudaReproTests::test_atomic_add_bfloat16_config, test/inductor/test_cuda_repro.py::CudaReproTests::test_autotune_inplace_kernel, test/inductor/test_cuda_repro.py::CudaReproTests::test_backward_context, test/inductor/test_cuda_repro.py::CudaReproTests::test_bool_emulate_low_precision, test/inductor/test_cuda_repro.py::CudaReproTests::test_bucketize_dynamic_dense, test/inductor/test_cuda_repro.py::CudaReproTests::test_bucketize_epilogue, test/inductor/test_cuda_repro.py::CudaReproTests::test_cat_int8_one_kernel, test/inductor/test_cuda_repro.py::CudaReproTests::test_cpu_index, test/inductor/test_cuda_repro.py::CudaReproTests::test_deterministic_algorithms, test/inductor/test_cuda_repro.py::CudaReproTests::test_dont_inplace_disjoint_accesses, test/inductor/test_cuda_repro.py::CudaReproTests::test_dtype_factory_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_persistent_reductions, test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_shapes, test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_to_static_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_effn_attn_bias_padding, test/inductor/test_cuda_repro.py::CudaReproTests::test_effn_attn_bias_padding_misaligned, test/inductor/test_cuda_repro.py::CudaReproTests::test_embedding_var_mean, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_low_precision, test/inductor/test_cuda_repro.py::CudaReproTests::test_epilogue_fusion_with_view, test/inductor/test_cuda_repro.py::CudaReproTests::test_expanded_inputs_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_expanded_inputs_cudagraphs_no_size_asserts, test/inductor/test_cuda_repro.py::CudaReproTests::test_flash_attention_dynamic, test/inductor/test_cuda_repro.py::CudaReproTests::test_float64_constants, test/inductor/test_cuda_repro.py::CudaReproTests::test_float8_e8m0fnu, test/inductor/test_cuda_repro.py::CudaReproTests::test_full_copy, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_add_fallback, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_inplace_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_no_fallback_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_indirect_indexing_dense_mask, test/inductor/test_cuda_repro.py::CudaReproTests::test_inductor_output_aliases_intermediate, test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_add_alpha_autotune, test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_buffer_autotune, test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_updates_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_input_channels_last, test/inductor/test_cuda_repro.py::CudaReproTests::test_int64_index_intermediate, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue100806, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue103461, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue103481, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue104759, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue97695_1input, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue97695_2input, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue_103924, test/inductor/test_cuda_repro.py::CudaReproTests::test_libdevice_routing, test/inductor/test_cuda_repro.py::CudaReproTests::test_linear_cpu_input, test/inductor/test_cuda_repro.py::CudaReproTests::test_linear_with_zero_infeature_size, test/inductor/test_cuda_repro.py::CudaReproTests::test_lookup_seed_backward, test/inductor/test_cuda_repro.py::CudaReproTests::test_max_autotune_nograd, test/inductor/test_cuda_repro.py::CudaReproTests::test_memory_history_inductor, test/inductor/test_cuda_repro.py::CudaReproTests::test_multi_output_layout_fallback, test/inductor/test_cuda_repro.py::CudaReproTests::test_mutated_aligned_tensor, test/inductor/test_cuda_repro.py::CudaReproTests::test_negative_arange_dynamic_shapes, test/inductor/test_cuda_repro.py::CudaReproTests::test_no_device_idx_repro_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_non_commutative_scan_op, test/inductor/test_cuda_repro.py::CudaReproTests::test_non_contiguous_unaligned_input_indices, test/inductor/test_cuda_repro.py::CudaReproTests::test_not_initializing_wrong_device, test/inductor/test_cuda_repro.py::CudaReproTests::test_permute_fusion, test/inductor/test_cuda_repro.py::CudaReproTests::test_reflection_pad_loop_order, test/inductor/test_cuda_repro.py::CudaReproTests::test_repeated_masked_load, test/inductor/test_cuda_repro.py::CudaReproTests::test_scalar_triton_index, test/inductor/test_cuda_repro.py::CudaReproTests::test_scaled_dot_product_efficient_attention_backward, test/inductor/test_cuda_repro.py::CudaReproTests::test_scatter_index_not_wrapped, test/inductor/test_cuda_repro.py::CudaReproTests::test_selecsls42b_misaligned_address, test/inductor/test_cuda_repro.py::CudaReproTests::test_simplify_dims, test/inductor/test_cuda_repro.py::CudaReproTests::test_sort_stride_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_sorted_masks, test/inductor/test_cuda_repro.py::CudaReproTests::test_split_reduction_channels_last, test/inductor/test_cuda_repro.py::CudaReproTests::test_split_reduction_transposed, test/inductor/test_cuda_repro.py::CudaReproTests::test_triton_interpret, test/inductor/test_cuda_repro.py::CudaReproTests::test_uint_view_copy, test/inductor/test_cuda_repro.py::CudaReproTests::test_unspec_inputs_interop, test/inductor/test_cuda_repro.py::CudaReproTests::test_unused_cpu_input_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_xlnet_lm_stride_repro 2025-08-14T22:05:51.3838447Z 2025-08-14T22:05:55.5195699Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:05:55.5197021Z import pkg_resources 2025-08-14T22:05:55.9073406Z Running inductor/test_kernel_benchmark 1/1 ... [2025-08-14 22:05:55.906827] 2025-08-14T22:05:55.9074002Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:05:55.9076231Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_kernel_benchmark.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:05:55.907220] 2025-08-14T22:06:03.2361097Z 2025-08-14T22:06:03.2362049Z inductor/test_kernel_benchmark 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_kernel_benchmark_1.1_40eaa2f5eb381aa6_.log 2025-08-14T22:06:03.2369490Z Running 18 items in this shard: test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_fused_layernorm_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_matmul_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_matmul_triton_kernel_benchmark, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_mm_slice_add_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_mm_slice_add_bandwidth_computation_2, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_mm_triton_kernel_benchmark, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_pw_kernel_benchmark, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_reduction_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_remove_inductor_deps, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_remove_inductor_deps_multiple_kernels, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_remove_inductor_deps_scalar, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_remove_inductor_deps_templates, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_slice_add_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_slice_add_cat_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_slice_mm_bandwidth_computation, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_split_scan, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_star_dep, test/inductor/test_kernel_benchmark.py::TestKernelBenchmark::test_unused_input_bandwidth_computation 2025-08-14T22:06:03.2376013Z 2025-08-14T22:06:07.3482310Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:06:07.3483636Z import pkg_resources 2025-08-14T22:06:07.7314994Z Running inductor/test_cudagraph_trees 1/1 ... [2025-08-14 22:06:07.730955] 2025-08-14T22:06:07.7315617Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:06:07.7317446Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudagraph_trees.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:06:07.731357] 2025-08-14T22:06:15.8658135Z 2025-08-14T22:06:15.8660143Z inductor/test_cudagraph_trees 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudagraph_trees_1.1_d7f34b4635e93142_.log 2025-08-14T22:06:15.8758837Z Running 158 items in this shard: test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_accumulate_grad, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_accumulate_multiple_recordings, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_alias_of_parameter, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliased_output_checkpoint, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliased_static_parameter, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliased_storage_single_weakref, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_aliasing_static_ref, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_amp_cache_disabled, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_backward_gets_cached_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cache_hit_forward_miss_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cached_boxed_forward_device_index, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cached_forward_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_checkpoint_shared_output_storage_deallocation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_checkpointing_resets_persistent_refs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cleanup, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_compiled_autograd_static_input_params, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_constant_output, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_conv_benchmark, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cpp_wrapper, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cudagraph_capture_sizes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cudagraph_capture_sizes1, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_cudagraph_capture_sizes2, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_dynamic_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_dynamic_warmup, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_empty_cpu_tensor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_empty_storage, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_end_recording_early, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_error_on_dealloc_use, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_error_on_dealloc_use2, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_execution_into_recording, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_expanded_inputs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times_due_to_cudagraph_managed_tensor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times_warn_only_once, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_backward_not_called_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_backward_not_called_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_generation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_forward_with_skipped_cudagraphed_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_frozen_fn, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_function_compiled_multiple_times, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_buffer_reuse, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_condition_op, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_only, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_op_and_dynamic_shapes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar1, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar2, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar3, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar4, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_device_put, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_multiple, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_mutation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_cpu_tensor_symints, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op_dynamoc_shapes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op_mutation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_custom_op_no_split, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_dynamic_scalar_inputs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_dynamic_shapes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_foreach_op, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_forward_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_forward_backward_not_called, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_forward_with_skipped_cudagraphed_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_fused_scheduler_node, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_gc, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_item, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_log_message, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_multiple_devices_msg, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reduce_overhead_mode_effectiveness, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reorder_cpu_and_gpu, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reorder_cpu_and_gpu_interleave, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reorder_custom_op_with_no_dependency, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_reorder_custom_op_with_no_dependency1, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_simple, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_symint, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_symint_cat_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_symint_from_mutation_index, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_symint_from_nested_indirect_indexing, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_unbacked_symint, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_graph_partition_unbacked_symint_multi_output_layout, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_item, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero_backend, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero_graph_breaks, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_index_put, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_live_outputs_multiple_graphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_manager_per_device, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mark_step, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_meta_tensor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_child_node, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_custom_module, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_custom_module_buffer, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_parent_node, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_builtin_module, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_builtin_module_buffers, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_param_inputs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multinomial, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multiple_devices_msg_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multiple_devices_msg_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_multiple_insert_removal_caching, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_only_once_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_only_once_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_config_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_config_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_on_inp_backend_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_on_inp_backend_inductor, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_mutation_reinplaced, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_no_rerecord_with_mark_static_address, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_not_fallback_to_eager_if_have_not_recompiling_too_many_times, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_output_alias, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_peristed_output_livenes, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_remove_hooks_on_cached_tensors, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_rerecord_if_static_input_address_changed, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_rng_non_trees, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_rng_trees, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_run_simple, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_separate_recordings, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_side_stream_memory_allocation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_single_stream_use, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_cpp_wrapper, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_cudagraph_unsafe_ops, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_if_dynamic_shape_limit_reached1, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_if_dynamic_shape_limit_reached2, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_skip_symbolic, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_sparsity, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_static_inputs_address_mutation_log, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_storage_access_error, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_tensor_constant_mutation, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_tensor_dies_between_checkpoint, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_tensor_no_longer_in_pool, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unaligned_static_input_no_cudagraphs, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unaligned_static_input_non_trees, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unaligned_static_input_trees, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unaligned_static_parameter, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_unstable_ptr, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_warmup_stream_sync, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_warn_on_pending_backward, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_warn_once_if_dynamic_shape_limit_reached, test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_workspace_allocation_error, test/inductor/test_cudagraph_trees.py::TestSAC::test_cpu_and_cuda_rng, test/inductor/test_cudagraph_trees.py::TestSAC::test_cudagraph_uneven_forward_backward, test/inductor/test_cudagraph_trees.py::TestSAC::test_cudagraphs_aot_eager_compat_equal, test/inductor/test_cudagraph_trees.py::TestSAC::test_cudagraphs_aot_eager_compat_equal_device_one, test/inductor/test_cudagraph_trees.py::TestSAC::test_graph_partition_cudagraphs_aot_eager_compat_equal, test/inductor/test_cudagraph_trees.py::TestSAC::test_multi_device, test/inductor/test_cudagraph_trees.py::TestSAC::test_retain_graph, test/inductor/test_cudagraph_trees.py::TestSAC::test_simple, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order0, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order1, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order2, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order3, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order4, test/inductor/test_cudagraph_trees.py::TestSAC::test_uneven_forward_backward_order5 2025-08-14T22:06:15.8853888Z 2025-08-14T22:06:20.0475371Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:06:20.0477620Z import pkg_resources 2025-08-14T22:06:20.5015993Z Running dynamo/test_dynamic_shapes 1/2 ... [2025-08-14 22:06:20.500887] 2025-08-14T22:06:20.5016655Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:06:20.5018143Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_dynamic_shapes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:06:20.501376] 2025-08-14T22:06:25.2198854Z 2025-08-14T22:06:25.2199779Z inductor/test_cpu_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_repro_1.1_b2551ab843b6cf3f_.log 2025-08-14T22:06:25.2758875Z Running 739 items in this shard: test/inductor/test_cpu_repro.py::CPUReproTests::test_ModularIndexing_range_issue_103133, test/inductor/test_cpu_repro.py::CPUReproTests::test__adaptive_avg_pool2d, test/inductor/test_cpu_repro.py::CPUReproTests::test_acosh_with_negative_large_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_add_layernorm, test/inductor/test_cpu_repro.py::CPUReproTests::test_argmax_argmin_with_nan_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_argmin, test/inductor/test_cpu_repro.py::CPUReproTests::test_asinh_with_corner_inputs, test/inductor/test_cpu_repro.py::CPUReproTests::test_aten_normal_dtype, test/inductor/test_cpu_repro.py::CPUReproTests::test_atomic_add_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_attention_size_mismatch, test/inductor/test_cpu_repro.py::CPUReproTests::test_auto_simd, test/inductor/test_cpu_repro.py::CPUReproTests::test_auto_zvec_vsx_simd, test/inductor/test_cpu_repro.py::CPUReproTests::test_avx2_bool_constant_pad_nd, test/inductor/test_cpu_repro.py::CPUReproTests::test_bf16_zeros, test/inductor/test_cpu_repro.py::CPUReproTests::test_bitwise_logical_op_bool, test/inductor/test_cpu_repro.py::CPUReproTests::test_bitwise_right_shift, test/inductor/test_cpu_repro.py::CPUReproTests::test_bitwise_shift_corner_inputs, test/inductor/test_cpu_repro.py::CPUReproTests::test_bool_max, test/inductor/test_cpu_repro.py::CPUReproTests::test_bool_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_broadcast_mul_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_broadcast_scalar_cpp_tile_2d_kernel, test/inductor/test_cpu_repro.py::CPUReproTests::test_cat_mul, test/inductor/test_cpu_repro.py::CPUReproTests::test_channel_shuffle_cl_output, test/inductor/test_cpu_repro.py::CPUReproTests::test_channels_last_view_as_complex, test/inductor/test_cpu_repro.py::CPUReproTests::test_complex_memory_overlap, test/inductor/test_cpu_repro.py::CPUReproTests::test_concat_inner_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_consistent_remove_buffers, test/inductor/test_cpu_repro.py::CPUReproTests::test_constant_bool_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_constant_store, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv2d_autocast, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv2d_bn_mixed_dtype, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv2d_packed, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_in_channel_1_dynamic_shapes, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_stride_constraints, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_transpose2d_has_output_size_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_transpose2d_packed_cpu, test/inductor/test_cpu_repro.py::CPUReproTests::test_conv_used_from_multiple_places, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_double_to_fp32_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_fp32_int64_oob_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_fp32_to_double_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_fp32_to_int64_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int32_to_int64_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int64_to_fp32_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int64_to_int32_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_convert_int8_to_half_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_cpp_kernel_profile, test/inductor/test_cpu_repro.py::CPUReproTests::test_cpu_vec_cosim, test/inductor/test_cpu_repro.py::CPUReproTests::test_decomposed_dequant_relu_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_decomposed_dequant_relu_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_decomposed_fake_quant_per_channel, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_maxpool2d_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_maxpool2d_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_fp8_e4m3, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_fp8_e5m2, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_quant_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_relu_quant_dequant_relu_quant_lowering_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_dequant_relu_quant_dequant_relu_quant_lowering_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_disabled_amp_is_inference_False, test/inductor/test_cpu_repro.py::CPUReproTests::test_disabled_amp_is_inference_True, test/inductor/test_cpu_repro.py::CPUReproTests::test_do_not_insert_to_dtype_for_memory_copy_only_kernel, test/inductor/test_cpu_repro.py::CPUReproTests::test_double_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_double_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_dropout, test/inductor/test_cpu_repro.py::CPUReproTests::test_eliminate_meaningless_copy, test/inductor/test_cpu_repro.py::CPUReproTests::test_embedding_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_embedding_vec_bf16, test/inductor/test_cpu_repro.py::CPUReproTests::test_expr_vec_non_contiguous, test/inductor/test_cpu_repro.py::CPUReproTests::test_float32_to_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_for_loop_collapsed, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp32_load_with_to_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_bfloat16_shape_15,3,13, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_bfloat16_shape_4,2048,4096, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float16_shape_15,3,13, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float16_shape_4,2048,4096, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float32_shape_15,3,13, test/inductor/test_cpu_repro.py::CPUReproTests::test_fp8_cast_float32_shape_4,2048,4096, test/inductor/test_cpu_repro.py::CPUReproTests::test_fractional_max_pool2d_3d_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_frexp, test/inductor/test_cpu_repro.py::CPUReproTests::test_full_bits_lowp, test/inductor/test_cpu_repro.py::CPUReproTests::test_full_boolean_dynamic_shape, test/inductor/test_cpu_repro.py::CPUReproTests::test_fused_attention_conv, test/inductor/test_cpu_repro.py::CPUReproTests::test_fused_node, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_large_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_large_size, test/inductor/test_cpu_repro.py::CPUReproTests::test_group_norm_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_highp_to_lowp_cse_var_cache_with_store, test/inductor/test_cpu_repro.py::CPUReproTests::test_horizontal_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_in_out_buffer, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_add, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_propagation_issue_102065, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_put, test/inductor/test_cpu_repro.py::CPUReproTests::test_index_put2, test/inductor/test_cpu_repro.py::CPUReproTests::test_inplace_add_alpha, test/inductor/test_cpu_repro.py::CPUReproTests::test_inplace_squeeze_needed, test/inductor/test_cpu_repro.py::CPUReproTests::test_insert_to_dtype_count, test/inductor/test_cpu_repro.py::CPUReproTests::test_int32_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int32_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int64_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int64_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_int_div, test/inductor/test_cpu_repro.py::CPUReproTests::test_int_div_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_invalid_dropout_args, test/inductor/test_cpu_repro.py::CPUReproTests::test_invalid_index_of_empty_tensor, test/inductor/test_cpu_repro.py::CPUReproTests::test_ir_node_str, test/inductor/test_cpu_repro.py::CPUReproTests::test_issue122380, test/inductor/test_cpu_repro.py::CPUReproTests::test_issue_148058, test/inductor/test_cpu_repro.py::CPUReproTests::test_large_mean, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_buffer_reuse, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_float64, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_packed, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_used_from_multiple_places, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_with_no_default_contiguous_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_linear_with_reshape, test/inductor/test_cpu_repro.py::CPUReproTests::test_load_half, test/inductor/test_cpu_repro.py::CPUReproTests::test_load_inf_bf16, test/inductor/test_cpu_repro.py::CPUReproTests::test_load_same_bool_tensor_twice, test/inductor/test_cpu_repro.py::CPUReproTests::test_local_buffer_in_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_local_buffer_with_line_reuse, test/inductor/test_cpu_repro.py::CPUReproTests::test_logical_op_store_to_lowp_data_dtype, test/inductor/test_cpu_repro.py::CPUReproTests::test_low_fp_index_expr_issue_147279, test/inductor/test_cpu_repro.py::CPUReproTests::test_lowp_fp_neg_abs, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_change_input_sizes_cpu_unbatched_False_input_size_2_hidden_size_5_num_layers_3_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_2_seq_len_3, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_False_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_1_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_1_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_False_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_False_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_False_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_False_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_1_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_1, test/inductor/test_cpu_repro.py::CPUReproTests::test_lstm_packed_unbatched_True_input_size_7_hidden_size_7_num_layers_7_bidirectional_True_bias_True_empty_state_True_batch_first_True_batch_size_7_seq_len_7, test/inductor/test_cpu_repro.py::CPUReproTests::test_masked_fill_softmax, test/inductor/test_cpu_repro.py::CPUReproTests::test_masked_fill_with_inf_or_nan_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_masked_load_int64_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_max_reduction_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_maxpool2d_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_maxpool2d_with_pre_loop_collapse_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_memory_copy_with_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_meta_device, test/inductor/test_cpu_repro.py::CPUReproTests::test_mkl_linear, test/inductor/test_cpu_repro.py::CPUReproTests::test_module_buffer_mutation, test/inductor/test_cpu_repro.py::CPUReproTests::test_multihead_attention_cpu, test/inductor/test_cpu_repro.py::CPUReproTests::test_new_vec_op_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_nn_fold, test/inductor/test_cpu_repro.py::CPUReproTests::test_nn_param_assign, test/inductor/test_cpu_repro.py::CPUReproTests::test_nn_param_assign_wrapped, test/inductor/test_cpu_repro.py::CPUReproTests::test_no_op_squeeze, test/inductor/test_cpu_repro.py::CPUReproTests::test_no_redundant_to_dtypes_between_fused_scheduler_node, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_index_with_constant_stride, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_load_buf_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_load_buf_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_non_contiguous_reduction_store, test/inductor/test_cpu_repro.py::CPUReproTests::test_ops_masked_with_bool_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_outer_loop_fusion_buffer_remove, test/inductor/test_cpu_repro.py::CPUReproTests::test_pack_padded_sequence_lstm, test/inductor/test_cpu_repro.py::CPUReproTests::test_pad_with_nan_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_parallel_num_threads, test/inductor/test_cpu_repro.py::CPUReproTests::test_parallel_reduction_vectorization, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_int8_bf16_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_module_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_channel_fake_quant_uint8_bf16_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_tensor_fake_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_per_tensor_fake_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_pow_cos, test/inductor/test_cpu_repro.py::CPUReproTests::test_randint_symint_input, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduce_with_masked, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduction_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduction_float_to_int64, test/inductor/test_cpu_repro.py::CPUReproTests::test_reduction_with_dynamic_threads, test/inductor/test_cpu_repro.py::CPUReproTests::test_redundant_to_node_elimination_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_relu_with_inf_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_repeat_interleave, test/inductor/test_cpu_repro.py::CPUReproTests::test_repeated_exp, test/inductor/test_cpu_repro.py::CPUReproTests::test_require_stride_order_non_owning, test/inductor/test_cpu_repro.py::CPUReproTests::test_scalar_mul_bfloat16, test/inductor/test_cpu_repro.py::CPUReproTests::test_scalar_sign_with_min, test/inductor/test_cpu_repro.py::CPUReproTests::test_scatter_using_atomic_add, test/inductor/test_cpu_repro.py::CPUReproTests::test_scatter_using_atomic_add_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_select_tiliing_with_index_expr, test/inductor/test_cpu_repro.py::CPUReproTests::test_set_source_Tensor, test/inductor/test_cpu_repro.py::CPUReproTests::test_share_local_buffers_in_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_sigmoid_with_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_sign_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_skip_cpp_codegen, test/inductor/test_cpu_repro.py::CPUReproTests::test_slice_scatter_default_end_value, test/inductor/test_cpu_repro.py::CPUReproTests::test_slice_scatter_issue122291, test/inductor/test_cpu_repro.py::CPUReproTests::test_store_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_symbolic_shape_scalar_value_reduction, test/inductor/test_cpu_repro.py::CPUReproTests::test_tanh_atan2, test/inductor/test_cpu_repro.py::CPUReproTests::test_tanh_atan2_use_decompose_tanh, test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_load_decomposed_dequant_add_relu_quant_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_load_decomposed_dequant_add_relu_quant_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_store_channel_shuffle_cl_quant_output_int8, test/inductor/test_cpu_repro.py::CPUReproTests::test_tile2d_store_channel_shuffle_cl_quant_output_uint8, test/inductor/test_cpu_repro.py::CPUReproTests::test_timed_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_channels_last_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_dtype_bool_float, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_dtype_float_bool, test/inductor/test_cpu_repro.py::CPUReproTests::test_to_uint8_rounding_method, test/inductor/test_cpu_repro.py::CPUReproTests::test_torch_logit, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_copy, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_mxn_16_16_bf16_fp16, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_mxn_32_32_bf16_fp16, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_non_contiguous, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_sum2d_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_sum_outer, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_vertical_sum_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_transpose_with_norm, test/inductor/test_cpu_repro.py::CPUReproTests::test_two_local_buffers_in_outer_loop_fusion, test/inductor/test_cpu_repro.py::CPUReproTests::test_two_local_buffers_in_outer_loop_fusion_case2, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint32_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint32_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint64_pointwise_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint64_reduction_vec, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint8_add, test/inductor/test_cpu_repro.py::CPUReproTests::test_uint8_sub, test/inductor/test_cpu_repro.py::CPUReproTests::test_unrolled_bool_prod_vectorized, test/inductor/test_cpu_repro.py::CPUReproTests::test_unsupported_conv_transpose, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_bitwise, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_compare_op_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_contiguous_ModularIndexing, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_cpu_only_for_all_available_isa, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_dynamic_shapes, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_indirect_load_cse_cache, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_kernel_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_logical, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_randn, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_remainder, test/inductor/test_cpu_repro.py::CPUReproTests::test_vec_transpose_lowp_fp, test/inductor/test_cpu_repro.py::CPUReproTests::test_vector_norm_compile, test/inductor/test_cpu_repro.py::CPUReproTests::test_vertical_sum_cpu_only, test/inductor/test_cpu_repro.py::CPUReproTests::test_view_dtype 2025-08-14T22:06:25.3163447Z 2025-08-14T22:06:29.3290257Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:06:29.3291629Z import pkg_resources 2025-08-14T22:06:29.7129324Z Running dynamo/test_dynamic_shapes 2/2 ... [2025-08-14 22:06:29.712390] 2025-08-14T22:06:29.7129790Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:06:29.7132895Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_dynamic_shapes.py', '-m', 'not serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:06:29.712881] 2025-08-14T22:13:37.3415555Z 2025-08-14T22:13:37.3418601Z dynamo/test_dynamic_shapes 1/2 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_dynamic_shapes_1.2_6173a3b68ef2ec1a_.log 2025-08-14T22:13:37.3845742Z Running 962 items in this shard: test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_arguments_binding_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_cpu_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_cpu_graph_break_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_cpu_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_cpu_graph_break_inner_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_decorator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_device_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autograd_profiler_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autograd_profiler_enabled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_context_wrapping_grad_mode_nested_function_decorator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_amp_autocast_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_device_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_event_across_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_event_created_outside_of_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_event_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_compared_with_stream_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_context_manager1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_disable_saved_tensors_hooks_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_disable_saved_tensors_hooks_prev_disabled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_context_manager_customized_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_context_manager_with_graph_break_CustomizedCtxManager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_ctx_manager_with_graph_break_customized_ctx_manager_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_graph_break_inlining_autocast_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_graph_break_inlining_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_inactive_context_graph_break_local_nullctx_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_inactive_context_graph_break_stack2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_nested_generic_context_manager_with_graph_break_CustomizedCtxManager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_nested_grad_mode_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_no_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_return_context_manager_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_sdpa_kernel_ctx_manager1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_sdpa_kernel_ctx_manager2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_sdpa_kernel_ctx_manager3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_T_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_add_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_addcmul__dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_are_functorch_transforms_active_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_attrgetter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_broadcast_foreach_pow_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_call_dict2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_call_dict3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_callable_class_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_callable_lambda_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_callable_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_class_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_cls_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_cls_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_cls_is_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_complex_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_constant1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_constant2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_constant3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_constant4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_constant_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_context_wrapping_nested_functions_no_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_cublas_allow_tf32_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_custom_dict_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_defaultdict_setdefault2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_defaultdict_setdefault3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_del_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_device_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_fromkeys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_key_set1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_key_set2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_key_set3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_mutable_map_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_ops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_param_keys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_setdefault1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_setdefault2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_setdefault3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_values_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_distributed_is_available_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_distributed_is_initialized_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dtype_compare_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_enumerate_custom_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_filter_fallback_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_filter_graph_break_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_filter_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_filter_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_foreach_lerp__dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_functools_cache_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_generic_namedtuple_subclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_generic_namedtuple_user_methods_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_get_autocast_gpu_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_get_default_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_get_device_properties_tensor_device_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_globalfn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_import1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_indexed_range_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_indirect3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_jit__unwrap_optional_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_script_if_tracing_fn_with_default_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_softmax_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_any_autocast_enabled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_checkpoint_valid_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_complex_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_contiguous_memory_format_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_floating_point_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_fx_tracing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_inference_recompilation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_not_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_isinstance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_islice_chain_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_chain_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_chain_from_iterable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_compress_tensors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_permutations_basic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_permutations_various_iterators_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_product_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_len_constant_misc_iterables_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_clear_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_convert_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_expand_lhs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_reversed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_setitem_slice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_slice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_listarg2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_listarg4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_load_global_bool_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_lru_cache_warning_issued_during_tracing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_mT_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_call_function_ex_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_dict_fromkeys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_list_slice_assign_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_max_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_reduce_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_sorted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_str_join_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_unpack_twice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_math_radians_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_mean_sum_np_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_methodcall1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_methodcall3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_methodcaller_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_min_max_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_namedtuple_defaults_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_namedtuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_namedtuple_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_namedtuple_subclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_namedtuple_user_methods_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ndarray_builtin_functions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ndarray_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ndarray_transpose_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ndim_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_no_recompile_inner_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_no_recompile_inner_lambda_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_not_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_np_constant_collections_as_input_int_or_float_float_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_np_constant_collections_guards_float_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_np_constant_collections_guards_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_np_iinfo_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_as_integer_ratio_num_type3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_conjugate_num_type2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_conjugate_num_type4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_is_integer_num_type6_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_attributes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_dtype_argument_to_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_dtype_call_in_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_fft_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_linalg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_meshgrid_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_obj_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partial_across_graph_break_uninvoked_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___annotations___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___class___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___code___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___dir___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___eq___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___ge___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___get___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___getattribute___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___globals___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___gt___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___init___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___init_subclass___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___kwdefaults___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___lt___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___module___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___qualname___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___reduce___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___reduce_ex___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___sizeof___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_lambda_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_recompilation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_torch_op_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_torch_op_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_udf_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_udf_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_udf_kwarg_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_pop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_pos_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_pow_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_promote_types_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_rand_inlined_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range_length_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range_with_slice_index_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_reduce_with_initial_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_reduce_with_none_initial_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_reduce_with_single_with_initial_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_dict2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_multiple_numpy_ndarray_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_tuple2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_returning_recursive_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_add_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_shape2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sorted_const_key_non_const_items_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sourceless_build_method_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_shortcut_with_start_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_with_start_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_dim_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_len_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_new_with_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_size_indexed_by_symint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_type4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_type5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_from_numpy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_get_device_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_truth_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tuple2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tuple_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tuple_sorted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unary_fold_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack_ex1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack_ex3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_viamethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_312_binary_slice_with_graph_break1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_RAISE_VARARGS_0_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_T_tensor_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_add_to_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_anomaly_aot_autograd_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_aot_autograd_propagate_unbacked_symints_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_arange_length_with_float32_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_argwhere_with_dynamic_shapes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_assert_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_assert_size_stride_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_backend_match_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_backend_match_guard_multi_threads_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_boolarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builder_for_class_with_metaclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_abs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_bool_on_symint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_subclasses_as_method_on_var_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_call_parent_non_class_methods_from_child_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_callpacked_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cat_unbacked_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_catch_watchings2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cell_output2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_class_duner_flags_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_class_has_instancecheck_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_clone_sparse_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_closure_out_of_scope_cell_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_closure_out_of_scope_cell_with_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_closure_with_mutation_and_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_closure_write_across_functions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compare_shapes_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compare_shapes_neq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compare_shapes_tuple_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compare_tensor_with_none_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compilation_metrics_size_limit_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_export_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_conditional_list_comp_in_context_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_config_getattr_default_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cross_entropy_loss_fancy_ctor2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cross_entropy_loss_simple_ctor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_data_access_in_inference_mode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_data_ptr_graph_break_aten_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_data_ptr_graph_break_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_default_dtype_change_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_deque_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_deterministic_algorithms_mutated_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dictcomp_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_methods_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_new_function_inlining2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_new_function_inlining3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_weakref_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_duplicate_graph_break_log_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_one_hot_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_shapes_as_strided_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_dynamic_override_regex_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_cache_invalidate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_cache_move_to_front_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_compiling_fake_tensor_to_vararg_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_disabled_in_custom_op_kernels_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_empty_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_enum_as_dict_key_with_overloaded_str_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_enum_guards_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_enum_no_graphbreaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_enum_subclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_error_on_nested_fx_trace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_error_on_recompile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_escaping_closure_var_with_backward_hook_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_escaping_closure_var_with_nonlocal_var_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_fail_on_recompile_error_message_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_flat_name_to_original_fqn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_fn_hasattr__name__1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_fn_hasattr__name__2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_fn_hasattr__name__3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_free_var_and_local_name_collision_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozen_dataclass_default_factory_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozen_dataclass_default_value_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozen_dataclass_hashable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozen_dataclass_kw_only_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_fullgraph_capture_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_funcname_cache_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_generate_trivial_abstract_impl_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_get_cache_entry_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_getset_descriptor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_global_state_guard_serialization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_grad_none_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_graph_break_compilation_metrics_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_graph_break_compilation_metrics_on_failure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_graph_break_correctly_when_passing_numpy_ndarray_to_torch_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_failure_fn2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_failure_fn_shape_control_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_fn_by_id_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_fn_by_name_and_value_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_globals_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_size_oblivious_backed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_size_oblivious_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guards_cse_pass_multiple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guards_strip_function_call_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_hash_hop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_nn_mod1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_nn_mod3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inference_mode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_closure_not_loaded_by_parent_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_closure_returned_by_another_function_and_captures_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_dict_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_dict_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_func_jump_on_tensor_condition_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_list_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_local_dict_clear_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inplace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inplace_view_on_graph_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inspect_signature_bind_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_shape_comparisons_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_shape_inplace_binops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_intermediary_tensor_grad_access_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_invalid_args_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_floating_point_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_tensor_like2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_tensor_like_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_item_changes_new_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_iter_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_iter_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_iterator_limit_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_accumulate_symint_default_sum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_accumulate_tensors_default_sum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_accumulate_tensors_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_groupby_pure_python_default_identify_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_infinite_count_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_infinite_repeat_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_islice_default_step_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_islice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_repeat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_tee_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_large_reduction_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_linear_module_free_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_append_return_none_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_iadd_side_effect_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_iadd_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_slice_mul_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_listcomp_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_load_fast_and_clear_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_map_with_quantization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mark_dynamic_with_ranges_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_module_complex_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_module_deepcopy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_module_not_callable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_multiple_inheritance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mutable_mapping_multiple_inheritance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_namedtuple2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_namedtuple_class_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_namedtuple_with_custom_getitem_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_ne_operator_with_custom_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_ne_operator_with_custom_ne_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_closure_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_dataclass_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_frozen_dataclass_hashable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_function_resuming_with_correct_globals_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_optimize_decorator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_optimize_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_optimize_run_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_sequential_try_with_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_wraps_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nn_functional_reduction_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nn_module_getattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nn_sequential_invocation_reposition_indices_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_no_guard_for_unused_sym_node_fstring_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_no_raise_guard_partial_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_as_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_fallback_on_eager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_force_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_gt_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ndarray_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_non_torch_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_random_config_to_numpy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_readonly_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_recompilation_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_take_along_axis_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_tolist_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_torch_operators_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_variable_isinstance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_with_builtin_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_object_classmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_object_staticmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_onnx_shape_as_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_optimize_on_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_ordered_dict_alias_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_os_environ_get_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_os_environ_set_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_out_variants_with_resizing_on_graph_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_out_variants_with_resizing_on_graph_inputs_with_dynamic1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_out_variants_with_resizing_on_graph_inputs_with_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_outside_linear_module_free_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_packaging_version_parse_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pair_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_param_shape_binops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_parameter_free_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pep0479_convert_stopiteration_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_precompile_entries_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_precompile_fail_on_recompile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pt2_compliant_overload_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pure_python_accumulate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_py_guards_mark_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_python_slice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raise_guard_full_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raise_guard_partial_constraint_across_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raise_guard_partial_constraint_no_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raise_on_backend_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raises_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_range_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_range_iter_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_range_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_real_imag_tensor_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_recompile_message_on_parameter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_recompile_on_global_state_change_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_reconstruct_frozen_dataclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_recursion_depth_guards_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_recursive_inline_list_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_release_input_memory_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_remove_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_repro_graph_breaks_in__get_item_by_idx_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_restore_graphstate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_returning_func_with_captured_func_and_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_running_func_with_captured_func_and_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sample_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_scalar_device_movement_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_scalar_tensor_is_equivalent_to_symint_argument_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_scalar_tensor_is_equivalent_to_symint_list_argument_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_set_aliasing_recompiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_set_descriptor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_set_update_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_setattr_mutation2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_setattr_mutation3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_and_tuple_equality_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_empty_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_evaluate_expr_divisible_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_evaluate_expr_refinement_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_evaluate_expr_replacement_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_runtime_assert_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_no_recording_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_int_inplace_binops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_side_effects_codegen_update_mutated_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_simple_set_usage_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_size_dim_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sourceless_namedtuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_storage_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_str_format_assert2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_str_format_return2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_super_calling_with_metaclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sym_and_terms_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sym_max_unbacked_sizelike_simplification_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_symint_as_device_kwarg_non_strict_export_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sys_modules_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tagging_tensors_mix_used_unused_structure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_data_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_dict1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_dict2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_dict3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_dynamic_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_interacts_with_numpy_ndarray_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_is_contiguous_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_kd_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_kd_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_compile_ctx_on_forward_and_training_step_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_guards_stack_frame_register_inlining_deep_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_guards_stack_frame_register_inlining_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_nn_parameter_isinstance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_package_working_with_trace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_seed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_size_numel_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_variable_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tracing_nested_py_tree_tuples_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tracing_py_tree_tensor_subclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tuple_class_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tuple_from_tuple_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tuple_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tuple_mul_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_type_copy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_typing_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_typing_union_and_optional_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_typing_variable_isinstance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_2d_expand_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_sources_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_sources_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_symint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unhandled_exception_in_dynamo_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unpack5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unpack_tensor_shape_mismatch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_code_statically_known_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_binop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_class_python_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_setattr1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_function_variable_supports_function_argument_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_function_variable_supports_type_abcmeta_argument_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_getattr1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_getattr2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_getattribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_usr_cls_classmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_variable_tracker_recursively_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_version_ci_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_write_to_cells_with_name_shadowing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_write_to_closures_in_inlining_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_writes_to_cells_across_frames2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_yield_from_in_a_loop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_yield_from_user_stop_iteration_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_yield_send_to_subgenerator_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_Size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_add_sub_alpha_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_batch_encoding_clone_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_batchnorm_e2e_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_changing_stride_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_class_member_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_classmethod_with_slots_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_compilation_metrics_on_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_compile_complex_conj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_compile_copy__int_overload_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_const_dict_keyerror_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_data_attr_mutation_after_saved_for_bw_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dataclass_init_with_default_factory_with_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_ddp_checkpoint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dedup_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_deleted_compile_wrapper_segfault_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamic_shapes_float_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamic_shapes_right_side_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_ellipsis_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_embedding_backward_broadcasting_decomp_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_empty_graph_nested_calls_fullgraph_False_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_empty_list_contains_with_jump_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_ephemeral_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_error_return_without_exception_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_exec_wildcard_import_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_for_loop_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_get_parameter_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_grad_mode_carrying_correct_state_after_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_graph_break_on_jit_isinstance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_graph_break_on_jit_isinstance_pep585_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_graph_break_unsupported_fake_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_guard_fail_nested_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_guard_fail_tensor_bool_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_guard_ordering_shape_fail_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_bigbird_unsqueeze_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_classinstantier_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_gelu_inline_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_model_output_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_xsoftmax_inference_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_iadd_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_incompatible_configs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_indexing_with_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inductor_rng_default_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inlining_cornercase_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_is_make_fx_tracing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_is_symbolic_tracing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue111522_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue111918_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue1466_size_aot_autograd_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_jit_trace_errors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_kwargs_out_list_variable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_aliasing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_index_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_listcomp_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_maml_item_capture_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_many_views_with_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_map_with_multiple_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_merge_criteria_processor_list1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_method_overriding_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_multi_import_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_named_buffers_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_negative_floor_div_solve_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nested_while_loop_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_module_stack_bc_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nonconst_issubclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nullcontext1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nullcontext2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_numpy_tobytes_no_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_omegaconf_listconfig_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_ones_out_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_optim_state_references_cleared_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_optimized_deepcopy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_optimized_module_patched_init_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_optimized_module_training_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_out_nested_cell_shape_change_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_out_nested_cell_tuple_shape_change_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_out_root_cell_shape_change_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_out_root_cell_tuple_shape_change_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_overlapping_inputs_with_dynamic_shapes_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_partitioner_cse_respects_mutation_boundaries_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_primtorch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_primtorch_no_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_recursive_map_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_reformer_eval_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_reformer_train_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_relative_import_no_modulename_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_requires_grad_guards_with_grad_mode1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_return_value_duplication_mixed_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_return_value_duplication_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rewrite_assert_noop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rewrite_assert_with_msg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rng_state_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_setattr_requires_grad_graph_breaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_setitem_boolean_mask_diff_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_setitem_tuple_boolean_mask_diff_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_sigmoid_out2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_size_typematch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_slice_into_list_mutable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_sort_out2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_sort_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_split_with_sizes_aot_autograd_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_stk_sdd_is_transposed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_stop_iteration_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_super_classmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_super_classmethod_inheritance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_super_diamond_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_super_in_staticmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_swin_base_tensor_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_symnode_is_not_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_symnode_is_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_sys_monitoring_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_data_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_item_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_aot_eager_func_name_func1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_aot_eager_func_name_func2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_aot_eager_func_name_func3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_eager_func_name_func1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_eager_func_name_func2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_inductor_func_name_func1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_inductor_func_name_func2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_inductor_func_name_func3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_torch_ops_aten_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_torch_tensor_ops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_torch_variable_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_torchname_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_trace_functional_tensor_with_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tuple_enum_as_key_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_typed_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_typed_dict_total_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_udf_classes_reconstruction_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unbind_copy_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unpack_hooks_dont_run_during_tracing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_user_ctor_ctx_manager_custom_init_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_user_ctor_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_user_defined_object_callable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_validate_model_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_weakref_callback_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_weakref_construction_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_weakref_del_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_weakref_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_while_loop_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_while_loop_graph_break_inside_call_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_access_by_keys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_children_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_constloop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_conv_call_super_forward_directly_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_conv_transpose_call_forward_directly_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_conv_transpose_call_super_forward_directly_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_densenet_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_enumvalues_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_fnmembercmp1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_fnmembercmp2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_generation_tag_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_iseval1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_istraining1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module6_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_speculation_log_divergence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_attribute_precedence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_call_module_with_static_forward_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_name_string_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_static_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_moduledict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_modulelist_custom_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_modulelist_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_modulelist_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_modulemethod1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_modulemethod2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_named_children_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_nn_module_unspec_int_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_nn_moduledict_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameters1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameters2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_self_mutating1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_sequential_with_duplicated_module2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_simple_torch_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_submodules1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_super1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_torch_function_with_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_byte_tensor_does_not_crash_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_capture_symbolic_tracing_simple_within_fake_mode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_capture_symbolic_tracing_within_fake_mode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_mismatch_return_length_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_missing_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_non_list_operands_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dataclass_input_output_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_2_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_with_non_tensor_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_with_non_tensor_arg_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_with_non_tensor_output_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_with_non_tensor_output_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dynamic_slicing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dynamic_slicing_invalid_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_empty_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_compare_optimize_with_make_fx_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_control_flow_with_getattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_decomp_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_dim_cleanup_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_bypass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_bypass_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_with_complex_reorder_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_masking_with_no_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_mismatched_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_multi_dynamic_dim_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_multi_dynamic_dim_unsafe_relationship_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_no_raise_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_pass_arg_by_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_pass_arg_by_name_star_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_persist_assert_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_preserve_constraints_as_metadata_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_preserves_nn_module_stack_for_get_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_raise_guard_full_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_raise_on_relationship_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_shape_control_flow_1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_specialized_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_symbolic_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_args_with_default_float_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_args_with_default_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_cond_branches_calling_methods_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_cond_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_dict_values_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_free_function_and_class_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_method_on_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_method_on_module_invoke_twice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_not_none_control_flow_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_not_none_control_flow_pos_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_functools_wrapped_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_kwargs_with_default_None_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_kwargs_with_default_float_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_kwargs_with_default_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_kwargs_with_default_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_map_cond_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_stack_trace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_symbool_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_immutable_list_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_input_container_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_invalid_input_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_invalid_input_unused_nonlocal_ok_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_list_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_list_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_multiple_outputs_op_with_evaluator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_no_tensor_computation_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_no_tensor_computation_2_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_predispatch_with_for_out_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_predispatch_with_for_out_dtype_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_predispatch_with_higher_order_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_preserve_fx_node_metadata_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_preserve_fx_node_metadata_inline_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_preserve_fx_node_metadata_recompile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_remove_redundant_dynamic_dim_in_error_message_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_retracibility_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_retracibility_nested_list_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_round_dynamic_shapes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_sym_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_symbolic_tracing_within_fake_mode_with_constraints_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_symbolic_tracing_within_fake_mode_with_constraints_with_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_symbool_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_torch_inference_mode_ctx_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_zeroes_in_and_out_different_shape_on_test_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_zeroes_in_new_shape_scalar_out_permute_dupe_and_bypass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_zeroes_in_new_shape_scalar_out_permute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_getitem_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_extended_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_graph_break_on_item_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_indirect_unsupported3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_multigraph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_restore_state_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_paths_join_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_with_no_grad1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_stack_state1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_start1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_start4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_tuple_iterator_mutate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_access_module_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_constants_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_global_num_adds_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_input_num_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_tracked_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_untracked_global_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_concat_unbacked_shape_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_branches_no_arguments_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_free_variable_in_both_branches_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_graph_break_in_one_branch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_source_fn_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_subgraph_name_is_valid_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_with_empty_operands_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_dynamic_shapes_over_vmap_batch_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_fallback_on_graph_break_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_fallback_on_python_primitives_output_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hints_wrapper_incorrect_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hopify_generic_wrap_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_lift_tensors_with_shared_symbols_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_make_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_lowers_to_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_side_effect_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_source_fn_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_subgraph_name_is_valid_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_modules_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_nested_wrap_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_no_freevars_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_output_with_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_register_subclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_same_freevar_twice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_del_existing_attr_global_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_del_existing_attr_nonlocal_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_in_body_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_local_list_append_no_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_global_num_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_global_num_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_nonlocal_num_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_nonlocal_num_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_nonlocal_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_nested_nonlocal_list_append_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_nonlocal_list_append_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_new_attr_global_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_new_attr_global_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_tensor_with_unbacked_shape_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_unbacked_symbol_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_vmap_multiply_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_default_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_default_if_branch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_only_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_recompile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_pytree_args_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_pytree_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_source_fn_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_subgraph_name_is_valid_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_functional_call_disable_inline_nn_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_capture_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_freevar_python_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_non_tensor_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_pytree_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_recompile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_with_side_effect_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacfwd_randomness_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacrev_two_tensors_argnums_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_call_torch_compile_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_jvp_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_two_tensors_disable_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_two_tensors_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_linearize_jvp_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vjp_call_compiled_backward_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vjp_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vjp_multiple_outputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vjp_multiple_outputs_python_struct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_call_compiled_backward_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_call_torch_compile_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_free_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_multiple_invocation_in_dims_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_multiple_outputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_new_tensor_unused_in_body_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_over_vmap_captured_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_previous_illegal_op_no_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_pytree_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_recompile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_two_inputs_tuple_in_dims_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_with_conditional_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_with_graph_break_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_with_graph_break_lambda_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_alias_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_autograd_expand_mutation_backwards_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_autograd_expand_mutation_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_autograd_raises_invalid_leaf_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_sequence_nr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_with_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_autograd_function_tangent_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_data_ptr_access_copy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_data_ptr_access_fails_in_backward_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_double_backward_errors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_eager_sequence_nr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_grad_inputs_alias_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_inputs_overlapping_with_mutation_recompile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_inputs_overlapping_with_mutation_stress_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_multiple_aot_autograd_calls_dupe_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_negative_testing_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_nn_parameter_construction_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_no_storage_overlap_guards_no_aliasing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_no_storage_overlap_guards_no_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_split_with_sizes_aot_autograd_cleans_up_traceback_meta_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesTestSDPA::test_graph_break_SDPAParams_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesTestSDPA::test_input_SDPAParams_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesTestSDPA::test_returns_SDPAParams_dynamic_shapes 2025-08-14T22:13:37.4232198Z 2025-08-14T22:13:37.9728997Z Uploading artifacts took 0.63 seconds 2025-08-14T22:13:41.5183751Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:13:41.5185083Z import pkg_resources 2025-08-14T22:13:41.9264421Z Running dynamo/test_unspec 1/1 ... [2025-08-14 22:13:41.925925] 2025-08-14T22:13:41.9265381Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:13:41.9268944Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_unspec.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:13:41.926362] 2025-08-14T22:13:46.8008731Z 2025-08-14T22:13:46.8009838Z dynamo/test_unspec 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_unspec_1.1_c7274b4f8cf7bf0a_.log 2025-08-14T22:13:46.8024687Z Running 51 items in this shard: test/dynamo/test_unspec.py::UnspecTests::test_argmin_coerces_symint_to_intlist_spec, test/dynamo/test_unspec.py::UnspecTests::test_bool_tensor_ctor, test/dynamo/test_unspec.py::UnspecTests::test_builtin_getitem, test/dynamo/test_unspec.py::UnspecTests::test_builtin_max_min, test/dynamo/test_unspec.py::UnspecTests::test_compiled_random_calls_are_random, test/dynamo/test_unspec.py::UnspecTests::test_conv1d_symint_padding, test/dynamo/test_unspec.py::UnspecTests::test_data_dependent_evaluate_expr_graph_break, test/dynamo/test_unspec.py::UnspecTests::test_defaults, test/dynamo/test_unspec.py::UnspecTests::test_exponential, test/dynamo/test_unspec.py::UnspecTests::test_feed_random_values_into_graph_only, test/dynamo/test_unspec.py::UnspecTests::test_isinstance_symint, test/dynamo/test_unspec.py::UnspecTests::test_item_max, test/dynamo/test_unspec.py::UnspecTests::test_mark_01_dynamic, test/dynamo/test_unspec.py::UnspecTests::test_mark_static_inside, test/dynamo/test_unspec.py::UnspecTests::test_mark_unbacked, test/dynamo/test_unspec.py::UnspecTests::test_mark_unbacked_channels_last, test/dynamo/test_unspec.py::UnspecTests::test_mark_unbacked_hint_consistency, test/dynamo/test_unspec.py::UnspecTests::test_multiple_consecutive_random_calls_before_graph, test/dynamo/test_unspec.py::UnspecTests::test_no_recompilations, test/dynamo/test_unspec.py::UnspecTests::test_no_recompilations_with_efficient_attention, test/dynamo/test_unspec.py::UnspecTests::test_no_recompiles, test/dynamo/test_unspec.py::UnspecTests::test_no_recompiles_prod_backward, test/dynamo/test_unspec.py::UnspecTests::test_numpy_correctness, test/dynamo/test_unspec.py::UnspecTests::test_propagate_dynamic_dim, test/dynamo/test_unspec.py::UnspecTests::test_prune_torch_check, test/dynamo/test_unspec.py::UnspecTests::test_random_call_with_while_loop, test/dynamo/test_unspec.py::UnspecTests::test_random_object, test/dynamo/test_unspec.py::UnspecTests::test_random_object_methods, test/dynamo/test_unspec.py::UnspecTests::test_random_object_overridden_methods, test/dynamo/test_unspec.py::UnspecTests::test_random_values_with_graph_break, test/dynamo/test_unspec.py::UnspecTests::test_rshift_dynamic, test/dynamo/test_unspec.py::UnspecTests::test_shape_graph_break, test/dynamo/test_unspec.py::UnspecTests::test_specializing_numpy_float_in_control_flow, test/dynamo/test_unspec.py::UnspecTests::test_split_aot_autograd, test/dynamo/test_unspec.py::UnspecTests::test_sum_dimlist_spec, test/dynamo/test_unspec.py::UnspecTests::test_sym_int_conversion, test/dynamo/test_unspec.py::UnspecTests::test_symbol_guard_limit_before_specialize, test/dynamo/test_unspec.py::UnspecTests::test_symfloat_no_replacement, test/dynamo/test_unspec.py::UnspecTests::test_symfloat_to_tensor, test/dynamo/test_unspec.py::UnspecTests::test_tensorfiy_python_scalars_1, test/dynamo/test_unspec.py::UnspecTests::test_tensorfiy_python_scalars_2, test/dynamo/test_unspec.py::UnspecTests::test_tensorfiy_python_scalars_3, test/dynamo/test_unspec.py::UnspecTests::test_to_tensor, test/dynamo/test_unspec.py::UnspecTests::test_unspec_float_input, test/dynamo/test_unspec.py::UnspecTests::test_unspec_float_input_f64, test/dynamo/test_unspec.py::UnspecTests::test_unspec_float_output, test/dynamo/test_unspec.py::UnspecTests::test_unspec_float_precision, test/dynamo/test_unspec.py::UnspecTests::test_unspec_roundtrip_float_input, test/dynamo/test_unspec.py::UnspecTests::test_unspecialized_float_multiply_precision, test/dynamo/test_unspec.py::UnspecTests::test_use_and_specialize, test/dynamo/test_unspec.py::UnspecTestsDeviceCUDA::test_builtin_functions_on_device_cuda 2025-08-14T22:13:46.8037555Z 2025-08-14T22:13:50.9502397Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:13:50.9503735Z import pkg_resources 2025-08-14T22:13:51.3417940Z Running dynamo/test_repros 1/1 ... [2025-08-14 22:13:51.341198] 2025-08-14T22:13:51.3418372Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:13:51.3419454Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_repros.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:13:51.341647] 2025-08-14T22:13:56.8668830Z 2025-08-14T22:13:56.8669778Z dynamo/test_repros 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_repros_1.1_4d3d01defeadd90c_.log 2025-08-14T22:13:56.8751565Z Running 327 items in this shard: test/dynamo/test_repros.py::LRUCacheWarningTests::test_lru_cache_warning_issued_during_tracing, test/dynamo/test_repros.py::ReproTests::test_Size, test/dynamo/test_repros.py::ReproTests::test_abc_setattr, test/dynamo/test_repros.py::ReproTests::test_add_complex_conj, test/dynamo/test_repros.py::ReproTests::test_add_sub_alpha_out, test/dynamo/test_repros.py::ReproTests::test_addr_alpha_beta_out, test/dynamo/test_repros.py::ReproTests::test_amp_foreach_fake_impl, test/dynamo/test_repros.py::ReproTests::test_ao_fake_quantize_tracing, test/dynamo/test_repros.py::ReproTests::test_aot_autograd_runtime_wrapper_prologue_profiled, test/dynamo/test_repros.py::ReproTests::test_as_strided_on_base_with_mutation_works, test/dynamo/test_repros.py::ReproTests::test_as_strided_on_existing_view_banned, test/dynamo/test_repros.py::ReproTests::test_attached_attribute_in_dir, test/dynamo/test_repros.py::ReproTests::test_autograd_function_graph_break, test/dynamo/test_repros.py::ReproTests::test_avoid_dupe_specialization, test/dynamo/test_repros.py::ReproTests::test_batch_encoding_clone_inputs, test/dynamo/test_repros.py::ReproTests::test_batch_norm_act, test/dynamo/test_repros.py::ReproTests::test_batchnorm_e2e, test/dynamo/test_repros.py::ReproTests::test_bigbird_unsqueeze_inplace, test/dynamo/test_repros.py::ReproTests::test_bitwise_op_guard, test/dynamo/test_repros.py::ReproTests::test_bitwise_print_precedence, test/dynamo/test_repros.py::ReproTests::test_boxes_len, test/dynamo/test_repros.py::ReproTests::test_build_map_unpack_with_call, test/dynamo/test_repros.py::ReproTests::test_c_defined_metaclass, test/dynamo/test_repros.py::ReproTests::test_changing_stride, test/dynamo/test_repros.py::ReproTests::test_chunk_reformer_ff, test/dynamo/test_repros.py::ReproTests::test_class_member, test/dynamo/test_repros.py::ReproTests::test_classmethod_with_slots, test/dynamo/test_repros.py::ReproTests::test_compilation_metrics_on_error, test/dynamo/test_repros.py::ReproTests::test_compile_complex_conj, test/dynamo/test_repros.py::ReproTests::test_compile_copy__int_overload, test/dynamo/test_repros.py::ReproTests::test_const_dict_keyerror, test/dynamo/test_repros.py::ReproTests::test_contains_range_constprop, test/dynamo/test_repros.py::ReproTests::test_convert_boxes_to_pooler_format, test/dynamo/test_repros.py::ReproTests::test_copy_weird_strides, test/dynamo/test_repros.py::ReproTests::test_create_rand_mask_from_inputs, test/dynamo/test_repros.py::ReproTests::test_dalle2_maybe, test/dynamo/test_repros.py::ReproTests::test_data_attr_mutation_after_saved_for_bw, test/dynamo/test_repros.py::ReproTests::test_dataclass_in_module, test/dynamo/test_repros.py::ReproTests::test_dataclass_init_with_default_factory_with_inputs, test/dynamo/test_repros.py::ReproTests::test_ddp_checkpoint, test/dynamo/test_repros.py::ReproTests::test_dedup_global, test/dynamo/test_repros.py::ReproTests::test_deferred_runtime_asserts, test/dynamo/test_repros.py::ReproTests::test_delattr, test/dynamo/test_repros.py::ReproTests::test_delattr_raises, test/dynamo/test_repros.py::ReproTests::test_delattr_return, test/dynamo/test_repros.py::ReproTests::test_delete_local_error, test/dynamo/test_repros.py::ReproTests::test_deleted_compile_wrapper_segfault, test/dynamo/test_repros.py::ReproTests::test_delsubscr, test/dynamo/test_repros.py::ReproTests::test_delsubscr_raises, test/dynamo/test_repros.py::ReproTests::test_detectron2_instances_cat, test/dynamo/test_repros.py::ReproTests::test_disabling_unpack_hooks_within_compiled_region, test/dynamo/test_repros.py::ReproTests::test_distributions_subclass, test/dynamo/test_repros.py::ReproTests::test_do_paste_mask, test/dynamo/test_repros.py::ReproTests::test_dont_aggressively_write_assert, test/dynamo/test_repros.py::ReproTests::test_dropout_inline, test/dynamo/test_repros.py::ReproTests::test_dynamic_shape_disable_duck_size, test/dynamo/test_repros.py::ReproTests::test_dynamic_shapes_double_not_equal, test/dynamo/test_repros.py::ReproTests::test_dynamic_shapes_float_guard, test/dynamo/test_repros.py::ReproTests::test_dynamic_shapes_implicit_guard, test/dynamo/test_repros.py::ReproTests::test_dynamic_shapes_right_side, test/dynamo/test_repros.py::ReproTests::test_ellipsis, test/dynamo/test_repros.py::ReproTests::test_embedding_backward_broadcasting_decomp, test/dynamo/test_repros.py::ReproTests::test_empty_graph_nested_calls_fullgraph_False, test/dynamo/test_repros.py::ReproTests::test_empty_graph_nested_calls_fullgraph_True, test/dynamo/test_repros.py::ReproTests::test_empty_list_contains_with_jump, test/dynamo/test_repros.py::ReproTests::test_empty_out_dynamic, test/dynamo/test_repros.py::ReproTests::test_enum, test/dynamo/test_repros.py::ReproTests::test_ephemeral_module, test/dynamo/test_repros.py::ReproTests::test_error_return_without_exception_set, test/dynamo/test_repros.py::ReproTests::test_exception_in_dynamo_handling, test/dynamo/test_repros.py::ReproTests::test_exec_import, test/dynamo/test_repros.py::ReproTests::test_exec_wildcard_import, test/dynamo/test_repros.py::ReproTests::test_flip_bad_accuracy, test/dynamo/test_repros.py::ReproTests::test_for_loop_graph_break, test/dynamo/test_repros.py::ReproTests::test_for_loop_graph_break_before, test/dynamo/test_repros.py::ReproTests::test_foreach_decomp_arg_names, test/dynamo/test_repros.py::ReproTests::test_fsdp_set_input_mutation_applied_when_input_gets_no_gradients, test/dynamo/test_repros.py::ReproTests::test_function_in_skipfiles, test/dynamo/test_repros.py::ReproTests::test_functools_wraps, test/dynamo/test_repros.py::ReproTests::test_gan_repro_trying_to_backward_through_the_graph_a_second_time, test/dynamo/test_repros.py::ReproTests::test_generator_dealloc, test/dynamo/test_repros.py::ReproTests::test_get_parameter_dtype, test/dynamo/test_repros.py::ReproTests::test_global_fn_mutation, test/dynamo/test_repros.py::ReproTests::test_grad, test/dynamo/test_repros.py::ReproTests::test_grad_mode_carrying_correct_state_after_graph_break, test/dynamo/test_repros.py::ReproTests::test_grad_references_cleared, test/dynamo/test_repros.py::ReproTests::test_graph_break_on_jit_isinstance, test/dynamo/test_repros.py::ReproTests::test_graph_break_on_jit_isinstance_pep585, test/dynamo/test_repros.py::ReproTests::test_graph_break_unsupported_fake, test/dynamo/test_repros.py::ReproTests::test_guard_default_device, test/dynamo/test_repros.py::ReproTests::test_guard_fail_nested_tuple, test/dynamo/test_repros.py::ReproTests::test_guard_fail_tensor_bool, test/dynamo/test_repros.py::ReproTests::test_guard_ordering_shape_fail, test/dynamo/test_repros.py::ReproTests::test_guard_with_tuple_mutation, test/dynamo/test_repros.py::ReproTests::test_hasattr_builtin, test/dynamo/test_repros.py::ReproTests::test_hf_bigbird_unsqueeze, test/dynamo/test_repros.py::ReproTests::test_hf_classinstantier, test/dynamo/test_repros.py::ReproTests::test_hf_gelu_inline, test/dynamo/test_repros.py::ReproTests::test_hf_model_output, test/dynamo/test_repros.py::ReproTests::test_hf_t5_forward, test/dynamo/test_repros.py::ReproTests::test_hf_xsoftmax_inference, test/dynamo/test_repros.py::ReproTests::test_hf_xsoftmax_training, test/dynamo/test_repros.py::ReproTests::test_iadd_graph_break, test/dynamo/test_repros.py::ReproTests::test_incompatible_configs, test/dynamo/test_repros.py::ReproTests::test_indexing_with_list, test/dynamo/test_repros.py::ReproTests::test_inductor_dynamic_shapes_broadcasting, test/dynamo/test_repros.py::ReproTests::test_inductor_no_recursionerror_on_for_loops, test/dynamo/test_repros.py::ReproTests::test_inductor_rng_default_dtype, test/dynamo/test_repros.py::ReproTests::test_inference_mode_dynamic_shapes, test/dynamo/test_repros.py::ReproTests::test_inlining_cornercase, test/dynamo/test_repros.py::ReproTests::test_inplace_unsqueeze_input, test/dynamo/test_repros.py::ReproTests::test_int_format, test/dynamo/test_repros.py::ReproTests::test_intermediate_leaf_requires_grad, test/dynamo/test_repros.py::ReproTests::test_invalid_seq_unpack, test/dynamo/test_repros.py::ReproTests::test_is_make_fx_tracing, test/dynamo/test_repros.py::ReproTests::test_is_symbolic_tracing, test/dynamo/test_repros.py::ReproTests::test_isinstance_dtype, test/dynamo/test_repros.py::ReproTests::test_isinstance_storage, test/dynamo/test_repros.py::ReproTests::test_issue111522, test/dynamo/test_repros.py::ReproTests::test_issue111918, test/dynamo/test_repros.py::ReproTests::test_issue114171, test/dynamo/test_repros.py::ReproTests::test_issue126128, test/dynamo/test_repros.py::ReproTests::test_issue134451, test/dynamo/test_repros.py::ReproTests::test_issue1466_size_aot_autograd, test/dynamo/test_repros.py::ReproTests::test_issue175, test/dynamo/test_repros.py::ReproTests::test_jit_script_defaults, test/dynamo/test_repros.py::ReproTests::test_jit_trace_errors, test/dynamo/test_repros.py::ReproTests::test_kwargs_out_list_variable, test/dynamo/test_repros.py::ReproTests::test_list_aliasing, test/dynamo/test_repros.py::ReproTests::test_list_index, test/dynamo/test_repros.py::ReproTests::test_list_index_not_found, test/dynamo/test_repros.py::ReproTests::test_list_index_tensor_unsupported, test/dynamo/test_repros.py::ReproTests::test_list_reverse, test/dynamo/test_repros.py::ReproTests::test_list_self_reference, test/dynamo/test_repros.py::ReproTests::test_listcomp, test/dynamo/test_repros.py::ReproTests::test_longformer_chunk, test/dynamo/test_repros.py::ReproTests::test_longtensor_list, test/dynamo/test_repros.py::ReproTests::test_lru_cache_tracing, test/dynamo/test_repros.py::ReproTests::test_maml_item_capture, test/dynamo/test_repros.py::ReproTests::test_maml_no_item_capture, test/dynamo/test_repros.py::ReproTests::test_many_overlapping_inputs_does_not_explode_guards, test/dynamo/test_repros.py::ReproTests::test_many_views_with_mutation, test/dynamo/test_repros.py::ReproTests::test_map_with_multiple_args, test/dynamo/test_repros.py::ReproTests::test_maybe_multiply_symint, test/dynamo/test_repros.py::ReproTests::test_merge_criteria_processor_list1, test/dynamo/test_repros.py::ReproTests::test_merge_criteria_processor_list2, test/dynamo/test_repros.py::ReproTests::test_method_overriding, test/dynamo/test_repros.py::ReproTests::test_module_in_skipfiles, test/dynamo/test_repros.py::ReproTests::test_modules, test/dynamo/test_repros.py::ReproTests::test_multi_dot_import, test/dynamo/test_repros.py::ReproTests::test_multi_import, test/dynamo/test_repros.py::ReproTests::test_named_buffers, test/dynamo/test_repros.py::ReproTests::test_nanmean_out, test/dynamo/test_repros.py::ReproTests::test_negative_floor_div_solve, test/dynamo/test_repros.py::ReproTests::test_negative_shape_guard, test/dynamo/test_repros.py::ReproTests::test_nested_while_loop_graph_break, test/dynamo/test_repros.py::ReproTests::test_nn_module_callable, test/dynamo/test_repros.py::ReproTests::test_nn_module_property_closure, test/dynamo/test_repros.py::ReproTests::test_nn_module_stack_bc, test/dynamo/test_repros.py::ReproTests::test_nn_param_freevar_codegen, test/dynamo/test_repros.py::ReproTests::test_nn_parameter, test/dynamo/test_repros.py::ReproTests::test_nn_parameter_ctor_graph_breaks, test/dynamo/test_repros.py::ReproTests::test_nn_parametrize, test/dynamo/test_repros.py::ReproTests::test_no_grad_inline, test/dynamo/test_repros.py::ReproTests::test_no_tracing_into_eval_frame, test/dynamo/test_repros.py::ReproTests::test_no_tracing_into_eval_frame_ctx_manager, test/dynamo/test_repros.py::ReproTests::test_nonconst_issubclass, test/dynamo/test_repros.py::ReproTests::test_not_rewrite_assert_for_other_errors, test/dynamo/test_repros.py::ReproTests::test_nullcontext1, test/dynamo/test_repros.py::ReproTests::test_nullcontext2, test/dynamo/test_repros.py::ReproTests::test_numpy_not_ndarray_recompiles, test/dynamo/test_repros.py::ReproTests::test_numpy_tobytes_no_error, test/dynamo/test_repros.py::ReproTests::test_odict_get_item_index_name, test/dynamo/test_repros.py::ReproTests::test_omegaconf_dictconfig, test/dynamo/test_repros.py::ReproTests::test_omegaconf_listconfig_contains, test/dynamo/test_repros.py::ReproTests::test_omegaconf_listconfig_iter, test/dynamo/test_repros.py::ReproTests::test_ones_out_dynamic, test/dynamo/test_repros.py::ReproTests::test_optim_state_references_cleared, test/dynamo/test_repros.py::ReproTests::test_optimized_deepcopy, test/dynamo/test_repros.py::ReproTests::test_optimized_module_patched_init, test/dynamo/test_repros.py::ReproTests::test_optimized_module_training, test/dynamo/test_repros.py::ReproTests::test_os_fspath, test/dynamo/test_repros.py::ReproTests::test_out_nested_cell_shape_change, test/dynamo/test_repros.py::ReproTests::test_out_nested_cell_tuple_shape_change, test/dynamo/test_repros.py::ReproTests::test_out_none, test/dynamo/test_repros.py::ReproTests::test_out_overload_non_contiguous, test/dynamo/test_repros.py::ReproTests::test_out_root_cell_shape_change, test/dynamo/test_repros.py::ReproTests::test_out_root_cell_tuple_shape_change, test/dynamo/test_repros.py::ReproTests::test_output_aliases_intermediate, test/dynamo/test_repros.py::ReproTests::test_overlapping_inputs_with_dynamic_shapes_error, test/dynamo/test_repros.py::ReproTests::test_overwriting_params, test/dynamo/test_repros.py::ReproTests::test_partially_initialized_module_property, test/dynamo/test_repros.py::ReproTests::test_partitioner_activation_memory_budget_with_unbacked_symints, test/dynamo/test_repros.py::ReproTests::test_partitioner_cse_respects_mutation_boundaries, test/dynamo/test_repros.py::ReproTests::test_pointless_graph_removal, test/dynamo/test_repros.py::ReproTests::test_primtorch, test/dynamo/test_repros.py::ReproTests::test_primtorch_no_graph_break, test/dynamo/test_repros.py::ReproTests::test_randint_out_dynamic, test/dynamo/test_repros.py::ReproTests::test_recursive_map, test/dynamo/test_repros.py::ReproTests::test_reformer_eval, test/dynamo/test_repros.py::ReproTests::test_reformer_min_chunk_len, test/dynamo/test_repros.py::ReproTests::test_reformer_sorting, test/dynamo/test_repros.py::ReproTests::test_reformer_train, test/dynamo/test_repros.py::ReproTests::test_reinplacing, test/dynamo/test_repros.py::ReproTests::test_relative_import, test/dynamo/test_repros.py::ReproTests::test_relative_import_no_modulename, test/dynamo/test_repros.py::ReproTests::test_requires_grad_guards_with_grad_mode1, test/dynamo/test_repros.py::ReproTests::test_requires_grad_guards_with_grad_mode2, test/dynamo/test_repros.py::ReproTests::test_restricted_list_subclass1, test/dynamo/test_repros.py::ReproTests::test_restricted_list_subclass2, test/dynamo/test_repros.py::ReproTests::test_restricted_list_subclass3, test/dynamo/test_repros.py::ReproTests::test_return_value_duplication_mixed_grad, test/dynamo/test_repros.py::ReproTests::test_return_value_duplication_scalar, test/dynamo/test_repros.py::ReproTests::test_return_value_duplication_tensor, test/dynamo/test_repros.py::ReproTests::test_return_weakref, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_dont_change_bytecode, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_noop, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_with_msg, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_with_non_string_msg, test/dynamo/test_repros.py::ReproTests::test_rewrite_assert_without_msg, test/dynamo/test_repros.py::ReproTests::test_rng_state, test/dynamo/test_repros.py::ReproTests::test_seq_append_list, test/dynamo/test_repros.py::ReproTests::test_setattr_requires_grad_graph_breaks, test/dynamo/test_repros.py::ReproTests::test_setitem_boolean_mask_diff, test/dynamo/test_repros.py::ReproTests::test_setitem_tuple_boolean_mask_diff, test/dynamo/test_repros.py::ReproTests::test_sigmoid_out, test/dynamo/test_repros.py::ReproTests::test_sigmoid_out2, test/dynamo/test_repros.py::ReproTests::test_size_typematch, test/dynamo/test_repros.py::ReproTests::test_slice_into_list_mutable, test/dynamo/test_repros.py::ReproTests::test_slicing_dynamic_shape, test/dynamo/test_repros.py::ReproTests::test_slicing_dynamic_shape_setitem, test/dynamo/test_repros.py::ReproTests::test_sort_out, test/dynamo/test_repros.py::ReproTests::test_sort_out2, test/dynamo/test_repros.py::ReproTests::test_specialized_stride, test/dynamo/test_repros.py::ReproTests::test_split_with_sizes_aot_autograd, test/dynamo/test_repros.py::ReproTests::test_staticmethod_allow_in_graph, test/dynamo/test_repros.py::ReproTests::test_stk_sdd_is_transposed, test/dynamo/test_repros.py::ReproTests::test_stop_iteration_reconstruct, test/dynamo/test_repros.py::ReproTests::test_str_isalnum, test/dynamo/test_repros.py::ReproTests::test_string_format, test/dynamo/test_repros.py::ReproTests::test_subclass_graph_output_repro, test/dynamo/test_repros.py::ReproTests::test_super_classmethod, test/dynamo/test_repros.py::ReproTests::test_super_classmethod_inheritance, test/dynamo/test_repros.py::ReproTests::test_super_diamond, test/dynamo/test_repros.py::ReproTests::test_super_in_staticmethod, test/dynamo/test_repros.py::ReproTests::test_super_staticmethod, test/dynamo/test_repros.py::ReproTests::test_swin_base_tensor_attr, test/dynamo/test_repros.py::ReproTests::test_symint_bitwise, test/dynamo/test_repros.py::ReproTests::test_symnode_is_not_op, test/dynamo/test_repros.py::ReproTests::test_symnode_is_op, test/dynamo/test_repros.py::ReproTests::test_sys_monitoring, test/dynamo/test_repros.py::ReproTests::test_tensor_data_kwarg, test/dynamo/test_repros.py::ReproTests::test_tensor_isinstance_tuple, test/dynamo/test_repros.py::ReproTests::test_tensor_item, test/dynamo/test_repros.py::ReproTests::test_tensor_random, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_aot_eager_func_name_func1, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_aot_eager_func_name_func2, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_aot_eager_func_name_func3, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_eager_func_name_func1, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_eager_func_name_func2, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_eager_func_name_func3, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_inductor_func_name_func1, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_inductor_func_name_func2, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_backend_inductor_func_name_func3, test/dynamo/test_repros.py::ReproTests::test_tensor_set_data_mismatched_dtype, test/dynamo/test_repros.py::ReproTests::test_tensor_split, test/dynamo/test_repros.py::ReproTests::test_tensor_split_within_device_cm, test/dynamo/test_repros.py::ReproTests::test_tensor_uniform, test/dynamo/test_repros.py::ReproTests::test_threading_local, test/dynamo/test_repros.py::ReproTests::test_tokenization, test/dynamo/test_repros.py::ReproTests::test_torch_compile_in_compile_frame, test/dynamo/test_repros.py::ReproTests::test_torch_ops_aten, test/dynamo/test_repros.py::ReproTests::test_torch_tensor_ops, test/dynamo/test_repros.py::ReproTests::test_torch_tensor_ops_no_graph_break, test/dynamo/test_repros.py::ReproTests::test_torch_variable_type, test/dynamo/test_repros.py::ReproTests::test_torchname, test/dynamo/test_repros.py::ReproTests::test_trace_functional_tensor_with, test/dynamo/test_repros.py::ReproTests::test_tuple_enum_as_key_dict, test/dynamo/test_repros.py::ReproTests::test_typed_dict, test/dynamo/test_repros.py::ReproTests::test_typed_dict_total, test/dynamo/test_repros.py::ReproTests::test_udf_classes_reconstruction, test/dynamo/test_repros.py::ReproTests::test_unbacked_arange_in_bounds, test/dynamo/test_repros.py::ReproTests::test_unbind_copy_out, test/dynamo/test_repros.py::ReproTests::test_unpack_hooks_can_be_disabled, test/dynamo/test_repros.py::ReproTests::test_unpack_hooks_dont_run_during_tracing, test/dynamo/test_repros.py::ReproTests::test_unspecialized_nn_module_with_torch_variable_attribute, test/dynamo/test_repros.py::ReproTests::test_unsqueeze_mul_strides, test/dynamo/test_repros.py::ReproTests::test_user_ctor_ctx_manager, test/dynamo/test_repros.py::ReproTests::test_user_ctor_ctx_manager_custom_init, test/dynamo/test_repros.py::ReproTests::test_user_ctor_ctx_manager_custom_init_graph_break, test/dynamo/test_repros.py::ReproTests::test_user_defined_iter, test/dynamo/test_repros.py::ReproTests::test_user_defined_object_callable, test/dynamo/test_repros.py::ReproTests::test_validate_model_kwargs, test/dynamo/test_repros.py::ReproTests::test_vc_bumped_in_inference_graph, test/dynamo/test_repros.py::ReproTests::test_vdd_duplicate_error, test/dynamo/test_repros.py::ReproTests::test_view_dtype_overload, test/dynamo/test_repros.py::ReproTests::test_weakref, test/dynamo/test_repros.py::ReproTests::test_weakref_callback, test/dynamo/test_repros.py::ReproTests::test_weakref_construction, test/dynamo/test_repros.py::ReproTests::test_weakref_del, test/dynamo/test_repros.py::ReproTests::test_weakref_reconstruct, test/dynamo/test_repros.py::ReproTests::test_while_loop_graph_break, test/dynamo/test_repros.py::ReproTests::test_while_loop_graph_break_inside_call_function, test/dynamo/test_repros.py::ReproTests::test_with_on_graph_break_inst, test/dynamo/test_repros.py::ReproTests::test_with_on_graph_break_nested, test/dynamo/test_repros.py::ReproTests::test_zeros_out_dynamic, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_data_dependent_error_log_no_print_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_deepcopy_constant_tensor_in_aot_bwd_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_filter_safe_grad_warning_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_filter_user_warnings_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_filter_warnings_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_flash_attn_backward_mixed_strides_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_getattr_return_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_guard_default_device_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_megablocks_moe_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_memleak_when_graph_input_has_tensor_attr_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_module_attribute_error_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_named_tuple_vt_clone_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_norm_dtype_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_partitioner_saves_weights_for_bw_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_sdpa_dynamic_shapes_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_sub_alpha_scalar_repro_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_tensor_size_hasattr_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_torch_cuda_is_initialized_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_truthiness_of_symints_no_recompiles_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_udf_class_source_cuda, test/dynamo/test_repros.py::ReproTestsDeviceCUDA::test_zero_dim_param_mixed_device_grad_cuda 2025-08-14T22:13:56.8831650Z 2025-08-14T22:14:01.0319836Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:14:01.0321276Z import pkg_resources 2025-08-14T22:14:01.4206090Z Running inductor/test_pattern_matcher 1/1 ... [2025-08-14 22:14:01.420043] 2025-08-14T22:14:01.4206618Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:14:01.4208461Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_pattern_matcher.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:14:01.420461] 2025-08-14T22:14:08.6008087Z 2025-08-14T22:14:08.6009320Z inductor/test_pattern_matcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_pattern_matcher_1.1_7f9e56c8e0a8bf85_.log 2025-08-14T22:14:08.6025142Z Running 47 items in this shard: test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_addmm, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_addmm_activation, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_addmm_broadcasting_bias, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_addmm_dtype_mismatch, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_addmm_symbolic_scalar, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_bmm_to_mm, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_cat_addmm, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_cat_mm, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_cat_slice_cat_cuda, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_cat_splitwithsizes, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_duplicate_search, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_fused_int_mm_mul, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_fused_int_mm_mul_epilogue, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_fused_int_mm_mul_gating, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_match_equivalent_function_invocations1, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_match_equivalent_function_invocations2, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_match_equivalent_function_invocations3, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_match_with_mutation, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_mixed_mm, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_mixed_mm_bad_cases, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_mixed_mm_cpu, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_mixed_mm_epi_works, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_mixed_mm_exhaustive_dtypes, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_mixed_mm_gating, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_mm_plus_mm, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_multioutput_register_replacement, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_mutation_op_matching, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_original_aten_preserved_split_addmm, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_pointless_convert, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_pointless_cumsum, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_pointless_permute_pair, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_pointless_permute_pair_3d, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_pointless_view_pair, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_pointless_view_pair_dynamic_shapes, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_remove_pointless_clones, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_replace_mul_zero, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_scaled_softmax, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_serialized_patterns_up_to_date, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_splitwithsizes_cat, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_stable_topological_sort, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_successful_partial_reuse_case0, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_successful_partial_reuse_case1, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_successful_partial_reuse_case2, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_symint_pattern_matching, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_unfuse_bias_addmm, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_unsuccessful_partial_reuse_case0, test/inductor/test_pattern_matcher.py::TestPatternMatcher::test_unsuccessful_partial_reuse_case1 2025-08-14T22:14:08.6039966Z 2025-08-14T22:14:12.7170080Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:14:12.7171393Z import pkg_resources 2025-08-14T22:14:13.0981437Z Running inductor/test_mkldnn_pattern_matcher 1/1 ... [2025-08-14 22:14:13.097603] 2025-08-14T22:14:13.0981976Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:14:13.0983516Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mkldnn_pattern_matcher.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:14:13.098007] 2025-08-14T22:14:21.0289611Z 2025-08-14T22:14:21.0290591Z inductor/test_mkldnn_pattern_matcher 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mkldnn_pattern_matcher_1.1_2c5b86ab1437d1d7_.log 2025-08-14T22:14:21.0435405Z Running 276 items in this shard: test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_add_scalar, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_binary_fusion_failed, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_binary_inplace_fusion_failed_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_binary_inplace_fusion_pass_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_dynamic_qlinear_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_dynamic_qlinear_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_dynamic_qlinear_qat_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_hardtanh_pattern_fallback, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_leaky_relu_pattern_fallback, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_add_bias, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_binary, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_binary_broadcast_shapes, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_dynamic_fp16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_fp32, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_input_non_contiguous_3D_wo_bias, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_relu_dynamic_fp16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_unary, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_multi_linear_share_same_input, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_add, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_add_relu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_hardswish, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_hardtanh, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_relu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_relu6, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_silu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qcat, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv1d_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_3, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_broadcast_shapes_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_dequant_promotion_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_dequant_promotion_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_int8_mixed_bf16_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_int8_mixed_bf16_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu6_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu6_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_int8_mixed_bf16_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_with_concat_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qflatten, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_fp8_inductor_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_xpu_use_relu_False_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_xpu_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_xpu_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_cpu_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_dynamic_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_fp8_inductor_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2_and_not_contiguous, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_and_not_contiguous, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_and_not_contiguous_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_mul, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_mul_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_mul_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qmaxpool2d, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_113440_issue_1, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_113440_issue_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_121253_issue_addmm_fusion_check, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_99842_issue, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_woq_int4_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_woq_int8, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_linear_input_non_contiguous_3D_wo_bias_dynamic_shapes, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_linear_unary_dynamic_shapes, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_q_attention_block, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_qat_bn_conv2d, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_qconv2d_maxpool2d_linear_dynamic_cpu 2025-08-14T22:14:21.0574307Z 2025-08-14T22:14:25.2042783Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:14:25.2044236Z import pkg_resources 2025-08-14T22:14:25.5903148Z Running inductor/test_extension_backend 1/1 ... [2025-08-14 22:14:25.589748] 2025-08-14T22:14:25.5904203Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:14:25.5905893Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_extension_backend.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:14:25.590216] 2025-08-14T22:14:34.9464695Z 2025-08-14T22:14:34.9466040Z dynamo/test_dynamic_shapes 2/2 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_dynamic_shapes_2.2_40957ef1725ed7e7_.log 2025-08-14T22:14:34.9855964Z Running 944 items in this shard: test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_float64_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_graph_break_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_sdpa_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_context_wrapping_grad_mode_decorator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_context_wrapping_set_grad_enabled_nested_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_event_method_create_stream_outside_of_compile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_event_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_across_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_compared_with_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_context_manager2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_disable_saved_tensors_hooks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_disable_saved_tensors_hooks_prev_disabled_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_context_manager_CustomizedCtxManager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_context_manager_with_graph_break_customized_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_ctx_manager_with_graph_break_CustomizedCtxManagerWithGraphBreak_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_grad_mode_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_inactive_context_graph_break_local_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_inactive_context_graph_break_local_nullctx2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_inactive_context_graph_break_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_is_autocast_cpu_enabled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_nested_generic_context_manager_CustomizedCtxManager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_nested_generic_context_manager_customized_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_nested_generic_context_manager_with_graph_break_customized_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_return_context_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_torch_profiler_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_torch_profiler_use_after_with_block_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_add__dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_addcdiv__dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_addcdiv_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_build_list_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_call_dict1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_call_dict4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_call_dict5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_callable_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_callable_torch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_chunks1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_compare_constant_and_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_const_tuple_add1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_const_tuple_add2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_constr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_lambda_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_defaultdict_setdefault1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_deque_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_device_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_copy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_id_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_items_sorted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_keys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_sorted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_tuple_lazy_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_update_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_update_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_elipsis_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_enumerate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_enumerate_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_filter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_filter_infinite_iterator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_finfo_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_flat_param_same_storage_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_float_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fn_with_self_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings6_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_funcdef_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_functools_partial_binding_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_functools_partial_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_generic_namedtuple_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_get_calculate_correct_fan_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_get_privateuse1_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_getattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_getattr_metaclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_globalmodule_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_globalvar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_in_not_in_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_index_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_indirect1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_indirect2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_jit_annotations_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_lru_cache_fn_with_default_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_with_default_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inner_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_contiguous_frame_counts_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_in_onnx_export_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_inference_mode_global_recompilation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_integer_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_not_null_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_quantized_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_sparse_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itemgetter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_combinations_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_compress_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_pairwise_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_permutations_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_product_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_product_various_iterators_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_jit_annotate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_len_constant_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_len_constant_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_len_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_add_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_add_then_mutate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_compare_polyfill_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_compare_polyfill_non_lists_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_index_with_constant_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_setitem_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_slice_assignment_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_sorted1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_sorted2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_truth_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_listarg1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_listarg3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_listarg5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_manual_seed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_deque_extendleft_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_enumerate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_infinite_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_list_extend_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_max_const_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_partial_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_sum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_unpack_vars_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_zip_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_methodcall2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_module_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ndarray_methods_returning_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ndarray_reshape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_non_inlined_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_np_constant_collections_as_input_int_or_float_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_np_finfo_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_as_integer_ratio_num_type0_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_bit_length_num_type1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_hex_num_type5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_random_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_obj_is_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ordered_dict_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_as_input_UDF_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_as_input_partials_lambda_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_as_input_partials_mod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_graph_break_reconstruct_args_and_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_graph_break_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_graph_break_reconstruct_mix_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_graph_break_reconstruct_mix_no_source_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___builtins___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___call___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___closure___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___defaults___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___delattr___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___dict___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___doc___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___format___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___hash___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___le___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___name___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___ne___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___new___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___repr___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___setattr___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___str___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___subclasshook___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr_keywords_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_set_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_udf_kwarg_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_rand_tensor_partial_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range_with_index_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_reduce_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_reduce_with_single_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_numpy_ndarray_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_tuple1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_round_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_in_frozenset_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_keys_view_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_update_bytecode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_update_list_with_duplicated_items_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_shape1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_size_tuple_add_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice6_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sliced_range_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_startswith_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_shortcut_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_shortcut_with_start_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_with_start_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_symbool_to_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_element_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_is_complex_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_new_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_type2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_type3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_to_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_distributions_functions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_size_as_dict_key_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_size_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_source_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_transpose_for_scores_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tuple1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tuple_iadd_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_two_point_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unary_fold_op_seq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack_ex2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack_mutable_map_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unsqueeze_inplace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_viatorch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_zip_longest_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_zip_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_312_binary_slice_with_graph_break2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_add_sizes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_any_all_symnode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_assigning_function_to_class_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_assigning_function_to_object_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_backward_deterministic_mode_mismatch_warning_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_bound_shape_checks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_build_tuple_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_bool_on_symbool_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_bool_on_symfloat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_isinstance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_str_on_user_defined_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_subclasses_as_method_on_class_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cannot_trace_mark_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cannot_trace_mark_dynamic_safe_unreached_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cast_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_catch_watchings1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cell_captured_by_existing_func_but_not_root_frame_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cell_output1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_class_duner_mro_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_closure_out_of_scope_cell_with_cond_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_closure_recompiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compare_shapes_tuple_neq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compare_shapes_with_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_export_single_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_with_quantization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_config_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_const_dict_variable_python_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_constant_getattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cross_entropy_loss_fancy_ctor1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_custom_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_custom_module_free_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dataclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dataclass_fields_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dataclass_local_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_default_args_device_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_defaultdict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_deque_append_left_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_derpy_nn_module_usage_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_descriptor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_descriptor_side_effect_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_disable_flag_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dtypes_no_graphbreaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_new_function_inlining1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_new_function_inlining4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_new_function_inlining_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_dynamic_override_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_force_parameter_static_shapes_and_property_static_shapes_override_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_precedence_over_int_specialization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_min_operator_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_reset_clears_cache_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_enum_as_dict_key_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_enum_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_existing_func_that_creates_capturing_nested_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_float_speculation_log_divergence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_fold_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozen_dataclass_attr_access_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozen_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozenset_of_non_literals_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozenset_torch_func_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_function_annotation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_function_generic_alias_annotation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_generate_tensor_from_list_of_numpy_primitive_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_get_attr_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_get_custom_tensor_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_get_instruction_source_311_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_getattr_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_getattrvariable_as_python_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_grad_non_none_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_grad_state_mutated_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_failure_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_failure_fn_tensor_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_fn_by_is_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_inbuilt_nn_modules_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_nn_modules_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_tensors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_function_builder_with_cse_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_size_oblivious_simplification_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_sym_node_fstring_when_used_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guards_cse_pass_single_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_hasattr_nn_module_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_hash_getitem_slice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_guarded_class_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_guarded_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_guarded_object_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_of_nn_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_nn_mod2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_user_defined_object2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_user_defined_object3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_user_defined_object_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_dict_function_passed_as_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_module_attr_dict_clear_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_user_defined_dict_attr_clear_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inplace_desugaring_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inplace_param_update_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_input_cell_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inspect_signature_bind_non_user_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inspect_signature_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_int_comparisons_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_neg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_shape_binops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_compiling_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_floating_point2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_tensor2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_item_changes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_item_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_accumulate_tensors_builtins_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_accumulate_tensors_user_defined_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_groupby_pure_python_key_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_infinite_cycle_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_infinite_repeat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_islice_default_end_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_class_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_hasattr1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_hasattr2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_iterator_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_mul_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mandelbrot_numpy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_map_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mark_static_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mark_unbacked_strict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_matmul1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_min_max_over_iterable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mro_type_tensor_no_source_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_named_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_namedtuple1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_namedtuple3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nan_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_ne_operator_with_custom_graphbreak_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_sequential_try_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_sequential_try_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_sequential_with_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nesteduserfunction_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_new_with_int_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_newly_constructed_tensor_attr_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nn_module_getattribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nn_sequential_invocation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_no_error_on_nested_fx_trace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_no_raise_guard_partial_constraint_across_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_non_pt2_compliant_ops_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_not_dynamic_scope_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numel_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_array_of_arrays_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_int_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_min_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ndarray_graph_break_with_multiple_outputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ndarray_works_with_builtin_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_no_raise_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_size_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_subdtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ufunc_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ufunc_out_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_unique_f16_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_object_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_ordered_dict_move_to_end_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_out_variant_custom_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_overridden_getattribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_patched_builtin_functions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_precompile_entry_hit_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_precompile_entry_miss_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_proxy_frozen_dataclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pt2_compliant_ops_are_allowed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pytree_tree_flatten_unflatten_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pytree_tree_leaves_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pytree_tree_map_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raise_guard_indirect_full_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raises_importerror1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raises_importerror2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_range_iter_guards_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_reconstruct_set_across_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_recursive_tensor_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_release_module_memory_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_release_scope_memory_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_repeat_interleave_graphbreaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_return_dict_with_graph_break_and_update_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_return_nested_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_returning_nested_func_with_captured_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_running_nested_func_with_captured_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_runtime_assert_replacement_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_scalar_tensor_is_equivalent_to_int_list_argument_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sequential_module_free_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_set_custom_tensor_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_set_discard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_setattr_mutation1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_constructor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_create_symbolic_sizes_strides_storage_offset_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_unbacked_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_recorded_function_fallback_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_int_comparisons_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_size_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_slice_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_source_non_input_grad_access_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_str_format_assert1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_str_format_return1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_stride_dim_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_structseq1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_structseq2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_super_after_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sym_constrain_range_on_replaced_unbacked_symbol_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_symint_as_device_kwarg_multi_gpu_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_symint_copy_into_unbacked_slice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_symint_fold_nontrivial_product_modulo_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tagging_tensors_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_build_list_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_ctor_list_of_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_dot_grad_no_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_item_capture_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_item_no_capture_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_layout_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_setattr_getset_descriptor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_types_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_thread_local_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_0d_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_1d_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_float_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_top_package_import_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_check_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_check_is_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_check_symbolic_shape_rel_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_distributions_lazy_property_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_dtype_python_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_dynamo_codegen_pow_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_generator_set_state_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_objects_as_keys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_size_numel_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_trace_ndarray_frame_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_trace_ndarray_frame_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tracing_nested_py_tree_dicts_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tracing_nested_py_tree_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tracing_nested_py_tree_mixed_all_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tracing_py_tree_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tracing_tree_map_only_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tuple_iadd_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tuple_mul_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_typing_typevar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_empty_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_repeat_cat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_strict_mode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unhandled_exception_in_dynamo2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unique_consecutive_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unpack4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_update_locals_and_stack_uses_shared_cache_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_class_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_object_class_interaction_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_setattr2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_function_variable_supports_enum_argument_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_property_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_usr_cls_staticmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_validate_outputs_unbacked_by_custom_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_validate_outputs_unbacked_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_variable_access_in_exception_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_with_builtin_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_writes_to_cells_across_frames1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_yield_from_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_yield_gen_and_from_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_abc_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_add_complex_conj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_addr_alpha_beta_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_amp_foreach_fake_impl_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_ao_fake_quantize_tracing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_aot_autograd_runtime_wrapper_prologue_profiled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_as_strided_on_base_with_mutation_works_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_as_strided_on_existing_view_banned_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_attached_attribute_in_dir_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_autograd_function_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_avoid_dupe_specialization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_batch_norm_act_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_bigbird_unsqueeze_inplace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_bitwise_op_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_bitwise_print_precedence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_boxes_len_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_build_map_unpack_with_call_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_c_defined_metaclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_chunk_reformer_ff_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_contains_range_constprop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_convert_boxes_to_pooler_format_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_copy_weird_strides_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_create_rand_mask_from_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dalle2_maybe_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dataclass_in_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_deferred_runtime_asserts_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delattr_raises_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delattr_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delete_local_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delsubscr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delsubscr_raises_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_detectron2_instances_cat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_disabling_unpack_hooks_within_compiled_region_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_distributions_subclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_do_paste_mask_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dont_aggressively_write_assert_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dropout_inline_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamic_shape_disable_duck_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamic_shapes_double_not_equal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamic_shapes_implicit_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_empty_graph_nested_calls_fullgraph_True_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_empty_out_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_enum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_exception_in_dynamo_handling_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_exec_import_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_flip_bad_accuracy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_for_loop_graph_break_before_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_foreach_decomp_arg_names_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_fsdp_set_input_mutation_applied_when_input_gets_no_gradients_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_function_in_skipfiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_functools_wraps_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_gan_repro_trying_to_backward_through_the_graph_a_second_time_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_generator_dealloc_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_global_fn_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_grad_references_cleared_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_guard_default_device_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_guard_with_tuple_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hasattr_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_t5_forward_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_xsoftmax_training_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inductor_dynamic_shapes_broadcasting_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inductor_no_recursionerror_on_for_loops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inference_mode_dynamic_shapes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inplace_unsqueeze_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_int_format_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_intermediate_leaf_requires_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_invalid_seq_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_isinstance_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_isinstance_storage_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue114171_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue126128_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue134451_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue175_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_jit_script_defaults_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_index_not_found_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_index_tensor_unsupported_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_reverse_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_self_reference_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_longformer_chunk_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_longtensor_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_lru_cache_tracing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_maml_no_item_capture_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_many_overlapping_inputs_does_not_explode_guards_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_maybe_multiply_symint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_merge_criteria_processor_list2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_module_in_skipfiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_modules_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_multi_dot_import_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nanmean_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_negative_shape_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_module_callable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_module_property_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_param_freevar_codegen_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_parameter_ctor_graph_breaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_parameter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_parametrize_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_no_grad_inline_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_no_tracing_into_eval_frame_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_no_tracing_into_eval_frame_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_not_rewrite_assert_for_other_errors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_numpy_not_ndarray_recompiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_odict_get_item_index_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_omegaconf_dictconfig_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_omegaconf_listconfig_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_os_fspath_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_out_none_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_out_overload_non_contiguous_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_output_aliases_intermediate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_overwriting_params_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_partially_initialized_module_property_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_partitioner_activation_memory_budget_with_unbacked_symints_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_pointless_graph_removal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_randint_out_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_reformer_min_chunk_len_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_reformer_sorting_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_reinplacing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_relative_import_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_requires_grad_guards_with_grad_mode2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_restricted_list_subclass1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_restricted_list_subclass2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_restricted_list_subclass3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_return_value_duplication_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_return_weakref_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rewrite_assert_dont_change_bytecode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rewrite_assert_with_non_string_msg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rewrite_assert_without_msg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_seq_append_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_sigmoid_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_slicing_dynamic_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_slicing_dynamic_shape_setitem_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_specialized_stride_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_staticmethod_allow_in_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_str_isalnum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_string_format_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_subclass_graph_output_repro_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_super_staticmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_symint_bitwise_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_isinstance_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_random_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_eager_func_name_func3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_mismatched_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_split_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_split_within_device_cm_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_uniform_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_threading_local_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tokenization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_torch_compile_in_compile_frame_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_torch_tensor_ops_no_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unbacked_arange_in_bounds_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unpack_hooks_can_be_disabled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unspecialized_nn_module_with_torch_variable_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unsqueeze_mul_strides_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_user_ctor_ctx_manager_custom_init_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_user_defined_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_vc_bumped_in_inference_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_vdd_duplicate_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_view_dtype_overload_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_weakref_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_with_on_graph_break_inst_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_with_on_graph_break_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_zeros_out_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_basicmodule1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_basicmodule2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_call_fn_with_non_const_inputs_safe_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_cfgmod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_conv_call_forward_directly_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_fnmember_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_forward_directly_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_inject_module_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_intarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_iseval2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_isnonelayer_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_istraining2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_layerlist_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module7_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_bad_params_call_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_bad_params_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_no_cls_to_become_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_class_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_comparison_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_forward_has_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_guard_name_is_valid_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_property_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_moduledict_custom_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_nn_module_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameterdict_custom_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameterdict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameters3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameters4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameters5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_seq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_sequential_with_duplicated_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_stringmember_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_submodules2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_super2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_super_class_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_tensorlist_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_torch_mangled_class_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_unsupportedmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_unsupportedmodule_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_viamodulecall_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_access_class_method_from_user_class_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_access_class_method_from_user_class_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_free_variables_overlapping_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_op_param_buffer_lifted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_branch_args_mismatch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_branch_return_multiple_tensors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_branch_return_non_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_mismatch_return_tensor_meta_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_non_tensor_operands_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_unsupported_pred_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_supported_pred_types_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_constraint_violation_error_messages_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dict_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dict_return_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dynamic_slicing_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dynamo_enum_in_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dynamo_list_index_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_enforce_equalities_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_cond_in_aten_symbolic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_decomp_asserts_bad_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_defaults_ok_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_control_flow_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_dim_not_1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_dim_range_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_with_complex_reorder_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_with_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_with_list_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_identity_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_meta_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_meta_val_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_mismatched_out_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_mismatched_out_2_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_mismatched_out_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_module_specify_constraints_signature_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_nn_module_stack_patched_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_no_tensor_computation_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_raise_guard_partial_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_args_and_empty_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_args_with_default_None_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_args_with_default_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_builtin_op_on_assume_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_cond_dynamic_shape_pred_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_cond_with_closed_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_free_function_and_class_method_multiarg_diff_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_free_function_and_class_method_multiarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_free_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_global_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_in_unspecialized_nn_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_list_nonzero_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_list_nonzero_free_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_none_control_flow_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_none_control_flow_free_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_not_none_control_flow_free_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_not_return_const_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_tuple_nonzero_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_functools_wrapped_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_kwargs_and_empty_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_map_zero_sized_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_map_zero_sized_tensor_suppress_errors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_module_layer_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_nonzero_static_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_shallow_list_copy_with_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_shallow_list_copy_wo_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_wrapped_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_exported_graph_serialization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_func_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_func_return_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_fx_pytree_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_invalid_input_global_multiple_access_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_invalid_input_nonlocal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_list_not_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_list_unpack_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_map_cond_param_buffer_lifted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_mixed_real_and_fake_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_nested_cond_op_param_buffer_lifted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_no_tensor_computation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_no_tensor_computation_fail_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_not_functionalize_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_param_buffer_safe_from_mutation_recurse_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_param_buffer_safe_from_mutation_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_pre_dispatch_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_predispatch_with_higher_order_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_preserve_fx_node_metadata_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_retracibility_dict_container_inp_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_strict_fake_tensor_prop_real_tensors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_subclass_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_sum_param_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_trivial_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_uncaptured_higher_order_op_error_not_suppresed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_untracked_inputs_in_constraints_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_zeroes_in_and_out_different_shape_on_test_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_zeroes_in_new_shape_scalar_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_capi_call1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_capi_call2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_capi_call3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_duck_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_order_dependence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_zero_inference_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_enumerate_not_break_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_indirect_unsupported1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_indirect_unsupported2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_no_graph_break_on_item_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_pop_after_resume_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_restore_range_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_restore_range_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_freevars_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_tuple_iterator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_with_no_grad2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_with_no_grad3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_stack_state2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_start2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_start3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_tuple_iterator_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_allow_python_side_effects_utility_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_global_num_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_numpy_number_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_tracked_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_untracked_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_untracked_nonlocal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_value_created_in_subgraph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_branches_no_arguments_no_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_pytree_operands_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_pytree_operands_with_non_tensor_leaves_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_side_effect_in_one_branches_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_with_constant_pred_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_enum_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_error_message_sane_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_fallback_on_graph_break_complicated_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_flat_list_output_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_fn_with_kwargs_in_torch_ops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_freevars_as_inputs_to_wrap_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_grad_source_fn_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hints_wrapper_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hints_wrapper_no_hints_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hints_wrapper_pytree_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hooks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_inlined_functions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_internal_nonlocal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_lift_tensor_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_lift_tensors_with_compound_expressions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_example_value_metadata_consistent_with_eager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_multi_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_pytree_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_symint_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_nested_tuple_output_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_register_mode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_return_captured_var_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_return_captured_var_used_multiple_times_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_return_captured_vars_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_del_existing_attr_global_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_del_existing_attr_nonlocal_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_global_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_global_tensor_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_global_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_nonlocal_tensor_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_existing_attr_global_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_existing_attr_global_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_existing_attr_nonlocal_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_existing_attr_nonlocal_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_new_attr_nonlocal_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_new_attr_nonlocal_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_symint_in_slice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_symint_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_tensor_and_unbacked_symbol_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_tensor_to_list_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_vmap_source_fn_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_all_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_allow_local_assign_in_body_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_default_else_branch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_pytree_args_not_const_symint_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_pytree_args_with_symint_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_functional_call_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_functional_call_sequential_params_and_buffers_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_call_compiled_backward_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_call_torch_compile_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_closure_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_fn_with_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_freevar_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_over_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_two_tensor_all_grad_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_two_tensor_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_hessian_argnums_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_hessian_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacfwd_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacfwd_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacfwd_two_tensors_argnums_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacrev_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacrev_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_freevar_python_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_freevar_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_two_tensors_disable_enable_disable_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vjp_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_free_const_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_get_wrapped_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_multiple_invocation_out_dims_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_multiple_outputs_diff_dims_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_multiple_outputs_out_dims_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_new_tensor_implicit_via_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_new_tensor_in_body_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_out_dims_None_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_over_vmap_two_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_recompile_different_config_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_recompile_same_config_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_recompile_with_randomness_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_side_effects_append_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_two_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_LSTM_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_autograd_expand_mutation_functionalizes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_export_joint_simple_repro_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_grad_mode_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param_non_tensor_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param_non_tensor_arg_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_safe_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_unsafe_control_flow_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_unsafe_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_data_ptr_access_fails_in_forward_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_different_inputs_overlapping_set_with_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer6_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_mutation1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_negative_testing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_requires_grad_fake_via_dynamo_recompiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesTestSDPA::test_intermediate_attr_access_SDPAParams_dynamic_shapes 2025-08-14T22:14:35.0230998Z 2025-08-14T22:14:35.3249487Z 2025-08-14T22:14:35.3250513Z inductor/test_extension_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_extension_backend_1.1_0c911183e2b98bf4_.log 2025-08-14T22:14:35.3251646Z Running 1 items in this shard: test/inductor/test_extension_backend.py::ExtensionBackendTests::test_open_device_registration 2025-08-14T22:14:35.3252174Z 2025-08-14T22:14:39.0441663Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:14:39.0444249Z import pkg_resources 2025-08-14T22:14:39.4105118Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:14:39.4106451Z import pkg_resources 2025-08-14T22:14:39.4241879Z Running inductor/test_perf 1/1 ... [2025-08-14 22:14:39.423779] 2025-08-14T22:14:39.4242623Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:14:39.4244957Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_perf.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:14:39.424182] 2025-08-14T22:14:39.7890763Z Running inductor/test_foreach 1/1 ... [2025-08-14 22:14:39.788506] 2025-08-14T22:14:39.7891205Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:14:39.7894447Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_foreach.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:14:39.788890] 2025-08-14T22:14:46.8034550Z 2025-08-14T22:14:46.8035464Z inductor/test_perf 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_perf_1.1_3ceddc46beaa51ca_.log 2025-08-14T22:14:46.8053787Z Running 66 items in this shard: test/inductor/test_perf.py::NumBytesMetricTests::test_cat, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_config_option, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_many_complex_inputs, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_many_simple_inputs, test/inductor/test_perf.py::NumBytesMetricTests::test_extern, test/inductor/test_perf.py::NumBytesMetricTests::test_index, test/inductor/test_perf.py::NumBytesMetricTests::test_pointwise, test/inductor/test_perf.py::NumBytesMetricTests::test_reduction, test/inductor/test_perf.py::FusionTests::test_create_block_mask, test/inductor/test_perf.py::FusionTests::test_double_softmax, test/inductor/test_perf.py::FusionTests::test_factory_reduction, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_outer_pointwise, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_pointwise, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_pointwise2, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_reduction, test/inductor/test_perf.py::FusionTests::test_horizontal_sum_pw_broadcast, test/inductor/test_perf.py::FusionTests::test_index_pointwise, test/inductor/test_perf.py::FusionTests::test_index_reduction, test/inductor/test_perf.py::FusionTests::test_layer_norm, test/inductor/test_perf.py::FusionTests::test_mutation_fusion, test/inductor/test_perf.py::FusionTests::test_neighbor, test/inductor/test_perf.py::FusionTests::test_norm_chain, test/inductor/test_perf.py::FusionTests::test_pointwise_multi_level_reduction, test/inductor/test_perf.py::FusionTests::test_reduction_pointwise_multi_level_reduction, test/inductor/test_perf.py::FusionTests::test_softmax_backward, test/inductor/test_perf.py::FusionTests::test_softmax_inner, test/inductor/test_perf.py::FusionTests::test_vertical_sum_pw, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice1, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice2, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice3, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice4_cpu, test/inductor/test_perf.py::TilingTests::test_tiling_simple, test/inductor/test_perf.py::TilingTests::test_tiling_three, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_cat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_dtype, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_full_remat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_keops, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_long_chain_add, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_partial_remat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_relu, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_unremat_bw, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_unremat_bw2, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_with_view, test/inductor/test_perf.py::NoopTests::test_noop_cat, test/inductor/test_perf.py::NoopTests::test_noop_clones, test/inductor/test_perf.py::NoopTests::test_noop_device_conversion, test/inductor/test_perf.py::NoopTests::test_noop_dtype_conversion, test/inductor/test_perf.py::NoopTests::test_noop_int_ops, test/inductor/test_perf.py::NoopTests::test_noop_slice_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_intermediate, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_training, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_training_two_mutated_inputs, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_two_mutated_inputs, test/inductor/test_perf.py::InplacingTests::test_inplace_randperm_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_scatter_noop_view, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_training, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v1, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v2, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v3, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v4, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v5, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v6, test/inductor/test_perf.py::InplacingTests::test_triton_kernel_not_fusable_with_users 2025-08-14T22:14:46.8070973Z 2025-08-14T22:14:49.9259797Z 2025-08-14T22:14:49.9260575Z inductor/test_foreach 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_foreach_1.1_0804f39aba0fa162_.log 2025-08-14T22:14:49.9436718Z Running 534 items in this shard: test/inductor/test_foreach.py::ForeachTests::test_2d_block_mixed_sizes_with_mask, test/inductor/test_foreach.py::ForeachTests::test_2d_block_no_mixed_sizes_no_mask, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_elems_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_2d_blocking_partitioning_mixed_sizes_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_aliasing, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_broadcasting__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_broadcasting_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_cpu_cpp_fallback_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_decomp__foreach_addcdiv, test/inductor/test_foreach.py::ForeachTests::test_decomp__foreach_addcmul, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_dynamic_shapes_fallback_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_enable_dynamic_shapes_cpp_wrapper_cuda, test/inductor/test_foreach.py::ForeachTests::test_enable_dynamic_shapes_python_wrapper, test/inductor/test_foreach.py::ForeachTests::test_foreach_cpp_wrapper_cuda, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_binary_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_unary_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_unary_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_unary_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_backward_unary_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_foreach_map_input_mutation, test/inductor/test_foreach.py::ForeachTests::test_fuse_concat, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_fusion_duplicate_buffer_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_kernel_split_arg_limit_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_multi_device, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_producer_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_consumer_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_non_foreach_producer_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_reinplacing__foreach_add_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing__foreach_div_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing__foreach_mul_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing__foreach_sub_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_after__foreach_add_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_after__foreach_div_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_after__foreach_mul_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_after__foreach_sub_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_before__foreach_add_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_before__foreach_div_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_before__foreach_mul_, test/inductor/test_foreach.py::ForeachTests::test_reinplacing_mut_before__foreach_sub_, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_scheduler_fusion_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_single_list__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_single_list_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_single_scalar__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_single_scalar_tensor_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_abs, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_neg, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_rsqrt, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_sign, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_sqrt, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_abs, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_addcmul_op, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_neg, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_recipaddmul_op, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_reciprocal, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_sign, test/inductor/test_foreach.py::ForeachTests::test_singleton_lists_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_add, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_copy, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_div, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_maximum, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_minimum, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_mul, test/inductor/test_foreach.py::ForeachTests::test_type_promotion__foreach_sub, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_add, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_add_op, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_addrecip_op, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_clamp_max, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_clamp_min, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_copy, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_div, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_maximum, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_minimum, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_mul, test/inductor/test_foreach.py::ForeachTests::test_type_promotion_foreach_map_sub, test/inductor/test_foreach.py::ForeachTests::test_zero_elems 2025-08-14T22:14:49.9605936Z 2025-08-14T22:14:50.9742068Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:14:50.9743436Z import pkg_resources 2025-08-14T22:14:51.3573296Z Running dynamo/test_modules 1/1 ... [2025-08-14 22:14:51.356775] 2025-08-14T22:14:51.3573942Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:14:51.3575867Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_modules.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:14:51.357160] 2025-08-14T22:14:54.0559506Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:14:54.0560913Z import pkg_resources 2025-08-14T22:14:54.4470174Z Running inductor/test_triton_wrapper 1/1 ... [2025-08-14 22:14:54.446464] 2025-08-14T22:14:54.4470633Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:14:54.4472298Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_wrapper.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:14:54.446908] 2025-08-14T22:14:58.9873429Z 2025-08-14T22:14:58.9874200Z dynamo/test_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_modules_1.1_4817c0227736a0db_.log 2025-08-14T22:14:58.9908476Z Running 134 items in this shard: test/dynamo/test_modules.py::NNModuleTests::test_access_by_keys, test/dynamo/test_modules.py::NNModuleTests::test_basicmodule1, test/dynamo/test_modules.py::NNModuleTests::test_basicmodule2, test/dynamo/test_modules.py::NNModuleTests::test_call_fn_with_non_const_inputs_safe, test/dynamo/test_modules.py::NNModuleTests::test_cfgmod, test/dynamo/test_modules.py::NNModuleTests::test_children, test/dynamo/test_modules.py::NNModuleTests::test_constloop, test/dynamo/test_modules.py::NNModuleTests::test_conv_call_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_conv_call_super_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_conv_transpose_call_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_conv_transpose_call_super_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_densenet, test/dynamo/test_modules.py::NNModuleTests::test_enumvalues, test/dynamo/test_modules.py::NNModuleTests::test_fnmember, test/dynamo/test_modules.py::NNModuleTests::test_fnmembercmp1, test/dynamo/test_modules.py::NNModuleTests::test_fnmembercmp2, test/dynamo/test_modules.py::NNModuleTests::test_forward_directly, test/dynamo/test_modules.py::NNModuleTests::test_generation_tag, test/dynamo/test_modules.py::NNModuleTests::test_hasattr, test/dynamo/test_modules.py::NNModuleTests::test_inject_module_parameters, test/dynamo/test_modules.py::NNModuleTests::test_intarg, test/dynamo/test_modules.py::NNModuleTests::test_iseval1, test/dynamo/test_modules.py::NNModuleTests::test_iseval2, test/dynamo/test_modules.py::NNModuleTests::test_isnonelayer, test/dynamo/test_modules.py::NNModuleTests::test_istraining1, test/dynamo/test_modules.py::NNModuleTests::test_istraining2, test/dynamo/test_modules.py::NNModuleTests::test_layerlist, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module1, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module2, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module4, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module5, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module6, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module7, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_bad_params, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_bad_params_call_function, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_kwargs, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_no_cls_to_become, test/dynamo/test_modules.py::NNModuleTests::test_lazy_module_speculation_log_divergence, test/dynamo/test_modules.py::NNModuleTests::test_module_attribute_precedence, test/dynamo/test_modules.py::NNModuleTests::test_module_call_module_with_static_forward, test/dynamo/test_modules.py::NNModuleTests::test_module_class_method, test/dynamo/test_modules.py::NNModuleTests::test_module_comparison, test/dynamo/test_modules.py::NNModuleTests::test_module_forward_has_graph_break, test/dynamo/test_modules.py::NNModuleTests::test_module_guard_name_is_valid, test/dynamo/test_modules.py::NNModuleTests::test_module_name_string, test/dynamo/test_modules.py::NNModuleTests::test_module_property, test/dynamo/test_modules.py::NNModuleTests::test_module_static_method, test/dynamo/test_modules.py::NNModuleTests::test_moduledict, test/dynamo/test_modules.py::NNModuleTests::test_moduledict_custom, test/dynamo/test_modules.py::NNModuleTests::test_modulelist, test/dynamo/test_modules.py::NNModuleTests::test_modulelist_custom, test/dynamo/test_modules.py::NNModuleTests::test_modulelist_nested, test/dynamo/test_modules.py::NNModuleTests::test_modulemethod1, test/dynamo/test_modules.py::NNModuleTests::test_modulemethod2, test/dynamo/test_modules.py::NNModuleTests::test_named_children, test/dynamo/test_modules.py::NNModuleTests::test_nn_module_setattr, test/dynamo/test_modules.py::NNModuleTests::test_nn_module_unspec_int_attr, test/dynamo/test_modules.py::NNModuleTests::test_nn_moduledict_contains, test/dynamo/test_modules.py::NNModuleTests::test_parameterdict, test/dynamo/test_modules.py::NNModuleTests::test_parameterdict_custom, test/dynamo/test_modules.py::NNModuleTests::test_parameters1, test/dynamo/test_modules.py::NNModuleTests::test_parameters2, test/dynamo/test_modules.py::NNModuleTests::test_parameters3, test/dynamo/test_modules.py::NNModuleTests::test_parameters4, test/dynamo/test_modules.py::NNModuleTests::test_parameters5, test/dynamo/test_modules.py::NNModuleTests::test_self_mutating1, test/dynamo/test_modules.py::NNModuleTests::test_seq, test/dynamo/test_modules.py::NNModuleTests::test_sequential_with_duplicated_module, test/dynamo/test_modules.py::NNModuleTests::test_sequential_with_duplicated_module2, test/dynamo/test_modules.py::NNModuleTests::test_simple_torch_function, test/dynamo/test_modules.py::NNModuleTests::test_stringmember, test/dynamo/test_modules.py::NNModuleTests::test_submodules1, test/dynamo/test_modules.py::NNModuleTests::test_submodules2, test/dynamo/test_modules.py::NNModuleTests::test_super1, test/dynamo/test_modules.py::NNModuleTests::test_super2, test/dynamo/test_modules.py::NNModuleTests::test_super_class_method, test/dynamo/test_modules.py::NNModuleTests::test_tensorlist, test/dynamo/test_modules.py::NNModuleTests::test_torch_function_with_closure, test/dynamo/test_modules.py::NNModuleTests::test_torch_mangled_class_name, test/dynamo/test_modules.py::NNModuleTests::test_unsupportedmethod, test/dynamo/test_modules.py::NNModuleTests::test_unsupportedmodule, test/dynamo/test_modules.py::NNModuleTests::test_viamodulecall, test/dynamo/test_modules.py::OptimizedModuleTest::test_assign_does_not_exist, test/dynamo/test_modules.py::OptimizedModuleTest::test_attr, test/dynamo/test_modules.py::OptimizedModuleTest::test_attr_precedence, test/dynamo/test_modules.py::OptimizedModuleTest::test_backward_hooks, test/dynamo/test_modules.py::OptimizedModuleTest::test_branch_on_nn_module_custom_bool, test/dynamo/test_modules.py::OptimizedModuleTest::test_branch_on_nn_module_custom_len, test/dynamo/test_modules.py::OptimizedModuleTest::test_buffer_order, test/dynamo/test_modules.py::OptimizedModuleTest::test_composition, test/dynamo/test_modules.py::OptimizedModuleTest::test_composition_with_opt_mod, test/dynamo/test_modules.py::OptimizedModuleTest::test_delattr_on_compiled_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_dir, test/dynamo/test_modules.py::OptimizedModuleTest::test_dunder_call_explicitly, test/dynamo/test_modules.py::OptimizedModuleTest::test_globals_change_in_other_file, test/dynamo/test_modules.py::OptimizedModuleTest::test_guard_on_torch_nn_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_allowed_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_allowed_modules_compiles, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_allowed_modules_compiles_self_contained, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_inner, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_outer, test/dynamo/test_modules.py::OptimizedModuleTest::test_hooks_skip_guards, test/dynamo/test_modules.py::OptimizedModuleTest::test_inline_inbuilt_nn_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_mark_static_nn_module_tensor, test/dynamo/test_modules.py::OptimizedModuleTest::test_mark_static_previously_seen_tensor, test/dynamo/test_modules.py::OptimizedModuleTest::test_mark_static_with_freezing, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_dict_iter_keys, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_dict_iter_name, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_dict_iter_values, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_order, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_patch, test/dynamo/test_modules.py::OptimizedModuleTest::test_module_setattr, test/dynamo/test_modules.py::OptimizedModuleTest::test_monkeypatching_forward, test/dynamo/test_modules.py::OptimizedModuleTest::test_nn_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_no_op_assignment, test/dynamo/test_modules.py::OptimizedModuleTest::test_no_recompile_on_nn_guarded_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_overridden_call, test/dynamo/test_modules.py::OptimizedModuleTest::test_param_order, test/dynamo/test_modules.py::OptimizedModuleTest::test_param_requires_grad, test/dynamo/test_modules.py::OptimizedModuleTest::test_patch_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_recompile_limit_on_freed_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_recompile_limit_on_guarded_nn_modules, test/dynamo/test_modules.py::OptimizedModuleTest::test_recursion, test/dynamo/test_modules.py::OptimizedModuleTest::test_save_and_load_all_backends, test/dynamo/test_modules.py::OptimizedModuleTest::test_save_and_load_inductor, test/dynamo/test_modules.py::OptimizedModuleTest::test_setattr_on_compiled_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_to, test/dynamo/test_modules.py::OptimizedModuleTest::test_trace_delattr, test/dynamo/test_modules.py::OptimizedModuleTest::test_udo_instance_method_as_hook, test/dynamo/test_modules.py::OptimizedModuleTest::test_unhashable_nn_submodule, test/dynamo/test_modules.py::OptimizedModuleTest::test_unspec_non_inlinable_module, test/dynamo/test_modules.py::OptimizedModuleTest::test_unspecialized_seq, test/dynamo/test_modules.py::OptimizedModuleTest::test_user_defined_nn_module_dynamic, test/dynamo/test_modules.py::NNModuleTestsDeviceCUDA::test_lazy_module3_cuda 2025-08-14T22:14:58.9942013Z 2025-08-14T22:15:01.7267925Z 2025-08-14T22:15:01.7268876Z inductor/test_triton_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_wrapper_1.1_c30d1c70a30ca772_.log 2025-08-14T22:15:01.7269915Z Running 1 items in this shard: test/inductor/test_triton_wrapper.py::TestTritonWrapper::test_wrapper_using_gpu_seed 2025-08-14T22:15:01.7270376Z 2025-08-14T22:15:03.1163797Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:15:03.1165455Z import pkg_resources 2025-08-14T22:15:03.5136260Z Running inductor/test_coordinate_descent_tuner 1/1 ... [2025-08-14 22:15:03.513006] 2025-08-14T22:15:03.5137299Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:15:03.5138552Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_coordinate_descent_tuner.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:15:03.513382] 2025-08-14T22:15:05.8549749Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:15:05.8551091Z import pkg_resources 2025-08-14T22:15:06.2320514Z Running inductor/test_select_algorithm 1/1 ... [2025-08-14 22:15:06.231499] 2025-08-14T22:15:06.2320945Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:15:06.2324814Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_select_algorithm.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:15:06.231952] 2025-08-14T22:15:10.6926808Z 2025-08-14T22:15:10.6928071Z inductor/test_coordinate_descent_tuner 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_coordinate_descent_tuner_1.1_aeadd411e4ddfc23_.log 2025-08-14T22:15:10.6930734Z Running 5 items in this shard: test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_abs_function, test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_get_neighbour_values, test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_no_neighbors, test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_persistent_reduction, test/inductor/test_coordinate_descent_tuner.py::TestCoordinateDescentTuner::test_value_too_large 2025-08-14T22:15:10.6933042Z 2025-08-14T22:15:13.5114462Z 2025-08-14T22:15:13.5115496Z inductor/test_select_algorithm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_select_algorithm_1.1_c5ea25282a63e0a3_.log 2025-08-14T22:15:13.5122254Z Running 21 items in this shard: test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_TritonTemplateCaller_str, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test__int_mm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_addmm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_addmm_fp16, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_baddbmm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_bmm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution1, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution2, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution_as_mm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_linear_relu, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_dropout, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_dup_args, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_dup_args_view, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_not_even_k, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_plus_mm, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_plus_mm2, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_plus_mm3, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_mm_skip, test/inductor/test_select_algorithm.py::TestSelectAlgorithm::test_preprocessing_single_choice, test/inductor/test_select_algorithm.py::TestTemplateRender::test_finalized_subclass_hooks 2025-08-14T22:15:13.5129200Z 2025-08-14T22:15:14.8554970Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:15:14.8556360Z import pkg_resources 2025-08-14T22:15:15.2675496Z Running inductor/test_dependencies 1/1 ... [2025-08-14 22:15:15.266952] 2025-08-14T22:15:15.2675948Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:15:15.2677072Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_dependencies.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:15:15.267356] 2025-08-14T22:15:17.6509777Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:15:17.6511127Z import pkg_resources 2025-08-14T22:15:18.0318751Z Running inductor/test_compiled_autograd 1/2 ... [2025-08-14 22:15:18.031378] 2025-08-14T22:15:18.0319230Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:15:18.0321684Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_autograd.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:15:18.031824] 2025-08-14T22:15:22.3965599Z 2025-08-14T22:15:22.3966779Z inductor/test_dependencies 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_dependencies_1.1_66551d8c8b89fd5c_.log 2025-08-14T22:15:22.3969108Z Running 5 items in this shard: test/inductor/test_dependencies.py::TestDependencies::test_bucketize_dependencies_no_sorter, test/inductor/test_dependencies.py::TestDependencies::test_bucketize_dependencies_sorter, test/inductor/test_dependencies.py::TestDependencies::test_get_offset, test/inductor/test_dependencies.py::TestDependencies::test_normalize_with_stride_order_equal, test/inductor/test_dependencies.py::TestDependencies::test_normalize_with_stride_order_unequal 2025-08-14T22:15:22.3970880Z 2025-08-14T22:15:26.4967875Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:15:26.4970128Z import pkg_resources 2025-08-14T22:15:26.8762693Z Running inductor/test_halide 1/1 ... [2025-08-14 22:15:26.875720] 2025-08-14T22:15:26.8763397Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:15:26.8765384Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_halide.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:15:26.876135] 2025-08-14T22:15:34.3554242Z 2025-08-14T22:15:34.3555251Z inductor/test_halide 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_halide_1.1_95b2b4243c683e1d_.log 2025-08-14T22:15:34.3555837Z 2025-08-14T22:15:38.4348295Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:15:38.4350938Z import pkg_resources 2025-08-14T22:15:38.8245402Z Running inductor/test_triton_extension_backend 1/1 ... [2025-08-14 22:15:38.823949] 2025-08-14T22:15:38.8245935Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:15:38.8247536Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_triton_extension_backend.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:15:38.824346] 2025-08-14T22:15:46.4546232Z 2025-08-14T22:15:46.4547720Z inductor/test_triton_extension_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_triton_extension_backend_1.1_845d01773facf697_.log 2025-08-14T22:15:46.4548579Z Running 0 items in this shard: 2025-08-14T22:15:46.4548759Z 2025-08-14T22:15:50.6047350Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:15:50.6048877Z import pkg_resources 2025-08-14T22:15:50.9918462Z Running inductor/test_autoheuristic 1/1 ... [2025-08-14 22:15:50.991331] 2025-08-14T22:15:50.9918922Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:15:50.9921631Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_autoheuristic.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:15:50.991794] 2025-08-14T22:15:58.2699272Z 2025-08-14T22:15:58.2700517Z inductor/test_autoheuristic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_autoheuristic_1.1_3f14a2c1f52c5c25_.log 2025-08-14T22:15:58.2701279Z Running 0 items in this shard: 2025-08-14T22:15:58.2701458Z 2025-08-14T22:16:02.3723177Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:16:02.3724622Z import pkg_resources 2025-08-14T22:16:02.7546255Z Running inductor/test_mps_basic 1/1 ... [2025-08-14 22:16:02.754081] 2025-08-14T22:16:02.7546861Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:16:02.7548597Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mps_basic.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:16:02.754507] 2025-08-14T22:16:10.2831560Z 2025-08-14T22:16:10.2832463Z inductor/test_mps_basic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mps_basic_1.1_25f4b06debb1e5f4_.log 2025-08-14T22:16:10.2833324Z 2025-08-14T22:16:14.4226385Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:16:14.4227757Z import pkg_resources 2025-08-14T22:16:14.8053202Z Running dynamo/test_precompile_context 1/1 ... [2025-08-14 22:16:14.804860] 2025-08-14T22:16:14.8053672Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:16:14.8056161Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_precompile_context.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:16:14.805243] 2025-08-14T22:16:21.9361261Z 2025-08-14T22:16:21.9362304Z dynamo/test_precompile_context 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_precompile_context_1.1_87ab36be47245142_.log 2025-08-14T22:16:21.9364208Z Running 3 items in this shard: test/dynamo/test_precompile_context.py::PrecompileContextTests::test_basic, test/dynamo/test_precompile_context.py::PrecompileContextTests::test_editable, test/dynamo/test_precompile_context.py::PrecompileContextTests::test_serialize_by_key 2025-08-14T22:16:21.9365369Z 2025-08-14T22:16:26.0522893Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:16:26.0525754Z import pkg_resources 2025-08-14T22:16:26.4349190Z Running dynamo/test_cudagraphs_expandable_segments 1/1 ... [2025-08-14 22:16:26.434425] 2025-08-14T22:16:26.4349708Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:16:26.4352263Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_cudagraphs_expandable_segments.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:16:26.434870] 2025-08-14T22:16:31.1083369Z 2025-08-14T22:16:31.1084686Z dynamo/test_cudagraphs_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_cudagraphs_expandable_segments_1.1_c9e3e333f5234bed_.log 2025-08-14T22:16:31.1088456Z Running 8 items in this shard: test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_basic, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_dead_fill, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_dtoh, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_factory, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_htod, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_mutate_constant, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_mutate_input, test/dynamo/test_cudagraphs_expandable_segments.py::TestAotCudagraphs::test_mutated_metadata 2025-08-14T22:16:31.1091130Z 2025-08-14T22:16:35.1903689Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:16:35.1905896Z import pkg_resources 2025-08-14T22:16:35.5721186Z Running dynamo/test_guard_manager 1/1 ... [2025-08-14 22:16:35.571589] 2025-08-14T22:16:35.5721885Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:16:35.5723528Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_guard_manager.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:16:35.572044] 2025-08-14T22:16:40.7463512Z 2025-08-14T22:16:40.7464444Z dynamo/test_guard_manager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_guard_manager_1.1_20c705b7bd535116_.log 2025-08-14T22:16:40.7475562Z Running 36 items in this shard: test/dynamo/test_guard_manager.py::GuardManagerTests::test_attr_guard_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_call_function_no_args_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_clone, test/dynamo/test_guard_manager.py::GuardManagerTests::test_default_device_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dict_contains_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dict_getitem_accessor, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dict_guard_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dict_version_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_diff_guard_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_dynamic_indices_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_equals_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_framelocals_accessor, test/dynamo/test_guard_manager.py::GuardManagerTests::test_framelocals_guard_e2e, test/dynamo/test_guard_manager.py::GuardManagerTests::test_global_state_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_global_state_reason, test/dynamo/test_guard_manager.py::GuardManagerTests::test_global_weakref, test/dynamo/test_guard_manager.py::GuardManagerTests::test_globals, test/dynamo/test_guard_manager.py::GuardManagerTests::test_guard_manager_leaf_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_id_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_item_guard_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_lambda_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_length_check_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_no_hasattr_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_no_tensor_aliasing_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_python_lambda_leaf_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_tensor_aliasing_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_tensor_match_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_tuple_iterator_getitem, test/dynamo/test_guard_manager.py::GuardManagerTests::test_type_guard, test/dynamo/test_guard_manager.py::GuardManagerTests::test_type_manager, test/dynamo/test_guard_manager.py::GuardManagerTests::test_weakref_alive_guard, test/dynamo/test_guard_manager.py::TypePropagationTests::test_basic_types, test/dynamo/test_guard_manager.py::TagSafetyChecks::test_dict_tag_safe, test/dynamo/test_guard_manager.py::TagSafetyChecks::test_immutable_tag_safe, test/dynamo/test_guard_manager.py::TagSafetyChecks::test_nn_module_tag_safe, test/dynamo/test_guard_manager.py::RecursiveDictGuardTests::test_disabling 2025-08-14T22:16:40.7485625Z 2025-08-14T22:16:44.8545432Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:16:44.8546817Z import pkg_resources 2025-08-14T22:16:45.2367159Z Running test_sparse_semi_structured 1/1 ... [2025-08-14 22:16:45.236160] 2025-08-14T22:16:45.2367752Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:16:45.2368990Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_semi_structured.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:16:45.236557] 2025-08-14T22:17:11.3547767Z 2025-08-14T22:17:11.3549000Z test_sparse_semi_structured 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_semi_structured_1.1_53126d24062a7223_.log 2025-08-14T22:17:11.3657143Z Running 218 items in this shard: test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_mlp_contiguous_relu_compile_cusparselt, test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_mlp_contiguous_relu_compile_cutlass, test/test_sparse_semi_structured.py::SparseSemiStructuredTensorCompileTest::test_sp24_compile, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_indices_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_indices_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape0_inference_mode_False_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape0_inference_mode_False_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape0_inference_mode_True_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape0_inference_mode_True_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape1_inference_mode_False_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape1_inference_mode_False_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape1_inference_mode_True_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape1_inference_mode_True_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape2_inference_mode_False_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape2_inference_mode_False_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape2_inference_mode_True_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape2_inference_mode_True_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape3_inference_mode_False_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape3_inference_mode_False_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape3_inference_mode_True_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_linear_dense_input_shape3_inference_mode_True_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_min_sparse_shape_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape0_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape0_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape1_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape1_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape2_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape2_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape3_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mlp_dense_input_shape3_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape0_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape0_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape0_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape1_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape2_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape2_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cusparselt_dense_input_shape2_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape0_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape0_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape0_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape1_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape2_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape2_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NN_backend_cutlass_dense_input_shape2_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape0_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape0_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape0_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape1_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape2_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape2_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cusparselt_dense_input_shape2_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape0_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape0_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape0_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape1_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape2_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape2_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_NT_backend_cutlass_dense_input_shape2_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape0_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape1_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_first_TN_dense_input_shape2_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape0_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape1_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NN_dense_input_shape2_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape0_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape1_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_mm_sparse_second_NT_dense_input_shape2_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_to_sparse_semi_structured_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dim_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dim_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_complex128, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_complex64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_float32, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_float64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_int16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_int32, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_int64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cusparselt_cuda_uint8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_complex128, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_complex64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_float32, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_float64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_int16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_int32, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_int64, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_dtype_backend_cutlass_cuda_uint8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cusparselt_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_unsupported_shape_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_values_backend_cusparselt_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUDA::test_values_backend_cutlass_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_all_patterns_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_all_patterns_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_all_patterns_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_conversions_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_linear_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_linear_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_linear_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_sparse_semi_structured_ops_cutlass_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_sparse_semi_structured_ops_cutlass_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUTLASSCUDA::test_sparse_semi_structured_ops_cutlass_backend_cutlass_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_gemm_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_gemm_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_edge_case1_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_edge_case1_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_id_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_id_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_meta_correctness_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_meta_correctness_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_meta_correctness_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pack_both_ways_meta_correctness_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_prune_dense_static_sort_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_prune_dense_static_sort_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pruning_algo_largest_abs_values_greedy_backend_cusparselt_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pruning_algo_largest_abs_values_greedy_backend_cusparselt_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pruning_algo_largest_abs_values_greedy_backend_cutlass_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_pruning_algo_largest_abs_values_greedy_backend_cutlass_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_apply_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_apply_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_apply_dense_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_apply_dense_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_matmuls_bmm_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_matmuls_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_matmuls_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredTrainingCUDA::test_sp24_matmuls_mat_vec_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_compile_autotune_bfloat16_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_compile_autotune_float16_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_compile_autotune_int32_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_mixed_dtype_bfloat16_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_mixed_dtype_float16_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_alpha_mixed_dtype_int32_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_mixed_dtype_bfloat16_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_mixed_dtype_float16_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_mixed_dtype_int32_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_search_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_search_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cslt_sparse_mm_search_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_csrc_cslt_sparse_mm_search_cuda_bfloat16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_csrc_cslt_sparse_mm_search_cuda_float16, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_csrc_cslt_sparse_mm_search_cuda_int8, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_cusparselt_backend_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_fp8fp8_mm_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_semi_structured_scaled_mm_bfloat16_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_semi_structured_scaled_mm_float16_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_semi_structured_scaled_mm_float32_dense_input_shape0_cuda, test/test_sparse_semi_structured.py::TestSparseSemiStructuredCUSPARSELTCUDA::test_sparse_semi_structured_scaled_mm_fp8_cuda 2025-08-14T22:17:11.3759418Z 2025-08-14T22:17:15.5006441Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:17:15.5008407Z import pkg_resources 2025-08-14T22:17:15.8877860Z Running export/test_serialize 1/1 ... [2025-08-14 22:17:15.887151] 2025-08-14T22:17:15.8878469Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:17:15.8879991Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_serialize.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:17:15.887617] 2025-08-14T22:17:20.8123206Z 2025-08-14T22:17:20.8123947Z export/test_serialize 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_serialize_1.1_fab45c99fd420d5a_.log 2025-08-14T22:17:20.8154346Z Running 101 items in this shard: test/export/test_serialize.py::TestSerialize::test_canonicalize, test/export/test_serialize.py::TestSerialize::test_export_example_inputs_preserved, test/export/test_serialize.py::TestSerialize::test_export_with_extension_op_serialization, test/export/test_serialize.py::TestSerialize::test_int_list, test/export/test_serialize.py::TestSerialize::test_kwargs_default, test/export/test_serialize.py::TestSerialize::test_metadata_parsing_with_layer_split, test/export/test_serialize.py::TestSerialize::test_metadata_run_decomp_serder, test/export/test_serialize.py::TestSerialize::test_multi_return_some_unused, test/export/test_serialize.py::TestSerialize::test_nested_layer_split, test/export/test_serialize.py::TestSerialize::test_nonfinite_inputs, test/export/test_serialize.py::TestSerialize::test_predispatch_export_with_autograd_op, test/export/test_serialize.py::TestSerialize::test_rational_ranges, test/export/test_serialize.py::TestSerialize::test_serialize_constant_outputs, test/export/test_serialize.py::TestSerialize::test_serialize_infinite_sym_int, test/export/test_serialize.py::TestSerialize::test_serialize_list_returns, test/export/test_serialize.py::TestSerialize::test_serialize_multiple_returns_from_node, test/export/test_serialize.py::TestSerialize::test_serialize_param_mutation, test/export/test_serialize.py::TestSerialize::test_serialize_sym_float, test/export/test_serialize.py::TestSerialize::test_serialize_sym_int, test/export/test_serialize.py::TestSerialize::test_symint_list, test/export/test_serialize.py::TestDeserialize::test_arg_from, test/export/test_serialize.py::TestDeserialize::test_auto_functionalize, test/export/test_serialize.py::TestDeserialize::test_basic, test/export/test_serialize.py::TestDeserialize::test_cond, test/export/test_serialize.py::TestDeserialize::test_constraints, test/export/test_serialize.py::TestDeserialize::test_custom_obj, test/export/test_serialize.py::TestDeserialize::test_custom_obj_list_out, test/export/test_serialize.py::TestDeserialize::test_custom_obj_tuple_out, test/export/test_serialize.py::TestDeserialize::test_device, test/export/test_serialize.py::TestDeserialize::test_dynamic, test/export/test_serialize.py::TestDeserialize::test_export_no_inputs, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_assume_constant_result, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_autograd_function, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_class_method, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_branch_class_method, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_branch_nested_function, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_branch_nonlocal_variables, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_closed_over_variable, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_operands, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_cond_predicate, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_constrain_as_size_example, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_constrain_as_value_example, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_decorator, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dictionary, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_assert, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_constructor, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_if_guard, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_map, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_slicing, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_dynamic_shape_view, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_fn_with_kwargs, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_list_contains, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_list_unpack, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_nested_function, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_null_context_manager, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_pytree_flatten, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_scalar_output, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_specialized_attribute, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_static_for_loop, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_static_if, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_tensor_setattr, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_type_reflection_method, test/export/test_serialize.py::TestDeserialize::test_exportdb_supported_case_user_input_mutation, test/export/test_serialize.py::TestDeserialize::test_get_attr, test/export/test_serialize.py::TestDeserialize::test_get_attr_list, test/export/test_serialize.py::TestDeserialize::test_hoo_symint_input, test/export/test_serialize.py::TestDeserialize::test_list_of_optional_tensors, test/export/test_serialize.py::TestDeserialize::test_map, test/export/test_serialize.py::TestDeserialize::test_module, test/export/test_serialize.py::TestDeserialize::test_module_meta, test/export/test_serialize.py::TestDeserialize::test_multi_return, test/export/test_serialize.py::TestDeserialize::test_multiple_getitem, test/export/test_serialize.py::TestDeserialize::test_none_input, test/export/test_serialize.py::TestDeserialize::test_optional_tuple, test/export/test_serialize.py::TestDeserialize::test_positional_argument_with_default_value, test/export/test_serialize.py::TestDeserialize::test_pytree_namedtuple, test/export/test_serialize.py::TestDeserialize::test_serialize_float8, test/export/test_serialize.py::TestDeserialize::test_shape, test/export/test_serialize.py::TestDeserialize::test_sym_bool, test/export/test_serialize.py::TestDeserialize::test_sym_bool_dynamic_shapes, test/export/test_serialize.py::TestDeserialize::test_sym_bool_torch_check_equal, test/export/test_serialize.py::TestDeserialize::test_sym_float, test/export/test_serialize.py::TestDeserialize::test_sym_int_torch_check_equal, test/export/test_serialize.py::TestDeserialize::test_sym_ite, test/export/test_serialize.py::TestDeserialize::test_tensor_tensor_list, test/export/test_serialize.py::TestDeserialize::test_unbacked_bindings_serialize, test/export/test_serialize.py::TestSchemaVersioning::test_error, test/export/test_serialize.py::TestSaveLoad::test_save_buffer, test/export/test_serialize.py::TestSaveLoad::test_save_constants, test/export/test_serialize.py::TestSaveLoad::test_save_extra, test/export/test_serialize.py::TestSaveLoad::test_save_file, test/export/test_serialize.py::TestSaveLoad::test_save_path, test/export/test_serialize.py::TestSaveLoad::test_version_error, test/export/test_serialize.py::TestSerializeCustomClass::test_backed_size_oblivious_serdes, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_class, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_class_containing_fake_tensor, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_class_input_to_function, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_tag_metadata_copy, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_tag_metadata_decomp, test/export/test_serialize.py::TestSerializeCustomClass::test_custom_tag_metadata_serialization, test/export/test_serialize.py::TestSerializeCustomClass::test_unbacked_range_serdes 2025-08-14T22:17:20.8183746Z 2025-08-14T22:17:24.9968281Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:17:24.9969641Z import pkg_resources 2025-08-14T22:17:25.3840364Z Running dynamo/test_structured_trace 1/1 ... [2025-08-14 22:17:25.383480] 2025-08-14T22:17:25.3840925Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:17:25.3842723Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_structured_trace.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:17:25.383960] 2025-08-14T22:17:32.6634050Z 2025-08-14T22:17:32.6634943Z dynamo/test_structured_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_structured_trace_1.1_9a2f3ac74641179f_.log 2025-08-14T22:17:32.6643534Z Running 25 items in this shard: test/dynamo/test_structured_trace.py::StructuredTraceTest::test_chromium_event, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_codecache, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_collective_schedule_empty, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_collective_schedule_real, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_compile_id_serialization_deserialization, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_compiled_autograd_attribution, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_compiled_autograd_chromium, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_compiled_autograd_id, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_cudagraphs, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_ddp_graphs, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_dump_file, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_dynamo_error, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_example_fn, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_example_training_fn, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_graph_breaks, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_graph_sizes_dynamic, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_guards_recompiles, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_inductor_error, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_make_fx_fail_partial, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_recompile_user_contexts, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_recompile_user_contexts_iteration, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_recompiles, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_runtime_estimates_mixed, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_runtime_estimates_simple, test/dynamo/test_structured_trace.py::StructuredTraceTest::test_schedule 2025-08-14T22:17:32.6651909Z 2025-08-14T22:17:36.8167727Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:17:36.8170396Z import pkg_resources 2025-08-14T22:17:37.1952829Z Running export/test_verifier 1/1 ... [2025-08-14 22:17:37.194695] 2025-08-14T22:17:37.1953561Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:17:37.1954719Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_verifier.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:17:37.195174] 2025-08-14T22:17:41.7200400Z 2025-08-14T22:17:41.7201454Z export/test_verifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_verifier_1.1_5ba982a4127f2374_.log 2025-08-14T22:17:41.7204907Z Running 10 items in this shard: test/export/test_verifier.py::TestVerifier::test_ep_verifier_basic, test/export/test_verifier.py::TestVerifier::test_ep_verifier_buffer_mutate, test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_buffer, test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_output, test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_param, test/export/test_verifier.py::TestVerifier::test_verifier_basic, test/export/test_verifier.py::TestVerifier::test_verifier_call_module, test/export/test_verifier.py::TestVerifier::test_verifier_higher_order, test/export/test_verifier.py::TestVerifier::test_verifier_nested_invalid_module, test/export/test_verifier.py::TestVerifier::test_verifier_no_functional 2025-08-14T22:17:41.7207860Z 2025-08-14T22:17:45.8209978Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:17:45.8212305Z import pkg_resources 2025-08-14T22:17:46.2113002Z Running dynamo/test_recompile_ux 1/1 ... [2025-08-14 22:17:46.210734] 2025-08-14T22:17:46.2113725Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:17:46.2115975Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_recompile_ux.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:17:46.211189] 2025-08-14T22:17:50.9349796Z 2025-08-14T22:17:50.9350739Z dynamo/test_recompile_ux 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_recompile_ux_1.1_4520745ddbe41221_.log 2025-08-14T22:17:50.9354164Z Running 10 items in this shard: test/dynamo/test_recompile_ux.py::RecompileUxTests::test_drop_cache_on_skip, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_dynamic_input, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_fail_on_recompile_limit_hit, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_loop_torture, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_mismatched_type, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_multiple_guard_fails, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_multiple_guard_fails_report_all, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_nvfuser_guards, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_recompile_child_run_only, test/dynamo/test_recompile_ux.py::RecompileUxTests::test_verbose_tensor_check 2025-08-14T22:17:50.9357265Z 2025-08-14T22:17:55.0453631Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:17:55.0454969Z import pkg_resources 2025-08-14T22:17:55.4281819Z Running dynamo/test_minifier 1/1 ... [2025-08-14 22:17:55.427630] 2025-08-14T22:17:55.4282416Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:17:55.4284960Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_minifier.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:17:55.428057] 2025-08-14T22:17:59.9519389Z 2025-08-14T22:17:59.9520234Z dynamo/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_minifier_1.1_14ea3adfa1f435f4_.log 2025-08-14T22:17:59.9525885Z Running 15 items in this shard: test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_accuracy_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_accuracy_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_compile_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_compile_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_runtime_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cpu_runtime_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_accuracy_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_accuracy_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_compile_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_compile_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_runtime_backend_passes_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_cuda_runtime_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_after_dynamo_non_leaf_compile_error_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_cpu_cuda_module_after_dynamo_cuda, test/dynamo/test_minifier.py::MinifierTestsCUDA::test_if_graph_minified_cuda 2025-08-14T22:17:59.9531453Z 2025-08-14T22:18:04.0976180Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:18:04.0977509Z import pkg_resources 2025-08-14T22:18:04.4820265Z Running inductor/test_cudagraph_trees_expandable_segments 1/1 ... [2025-08-14 22:18:04.481456] 2025-08-14T22:18:04.4821170Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:18:04.4823014Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cudagraph_trees_expandable_segments.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:18:04.481849] 2025-08-14T22:18:12.0106425Z 2025-08-14T22:18:12.0107739Z inductor/test_cudagraph_trees_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cudagraph_trees_expandable_segments_1.1_950b494803e8b65e_.log 2025-08-14T22:18:12.0169834Z Running 144 items in this shard: test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_accumulate_grad, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_accumulate_multiple_recordings, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_alias_of_parameter, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_aliased_output_checkpoint, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_aliased_static_parameter, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_aliased_storage_single_weakref, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_aliasing_static_ref, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_amp_cache_disabled, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_backward_gets_cached_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cache_hit_forward_miss_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cached_boxed_forward_device_index, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cached_forward_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_checkpoint_shared_output_storage_deallocation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_checkpointing_resets_persistent_refs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cleanup, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_compiled_autograd_static_input_params, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_constant_output, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_conv_benchmark, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cpp_wrapper, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cudagraph_capture_sizes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cudagraph_capture_sizes1, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_cudagraph_capture_sizes2, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_dynamic_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_dynamic_warmup, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_empty_cpu_tensor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_empty_storage, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_end_recording_early, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_error_on_dealloc_use, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_error_on_dealloc_use2, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_execution_into_recording, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_expanded_inputs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times_due_to_cudagraph_managed_tensor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_fallback_to_eager_if_recompiling_too_many_times_warn_only_once, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_backward_not_called_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_backward_not_called_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_generation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_forward_with_skipped_cudagraphed_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_frozen_fn, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_function_compiled_multiple_times, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_buffer_reuse, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_condition_op, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_only, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_op_and_dynamic_shapes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar1, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar2, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar3, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar4, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_device_put, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_multiple, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_scalar_mutation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_cpu_tensor_symints, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op_dynamoc_shapes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op_mutation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_custom_op_no_split, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_dynamic_scalar_inputs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_dynamic_shapes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_foreach_op, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_forward_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_forward_backward_not_called, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_forward_with_skipped_cudagraphed_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_fused_scheduler_node, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_gc, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_item, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_log_message, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_multiple_devices_msg, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reduce_overhead_mode_effectiveness, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reorder_cpu_and_gpu, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reorder_cpu_and_gpu_interleave, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reorder_custom_op_with_no_dependency, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_reorder_custom_op_with_no_dependency1, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_simple, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_symint, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_symint_cat_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_symint_from_mutation_index, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_symint_from_nested_indirect_indexing, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_unbacked_symint, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_graph_partition_unbacked_symint_multi_output_layout, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_item, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero_backend, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_incompatible_cudagraph_ops_nonzero_graph_breaks, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_index_put, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_live_outputs_multiple_graphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_manager_per_device, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mark_step, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_meta_tensor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_child_node, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_custom_module, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_custom_module_buffer, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_parent_node, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_builtin_module, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_builtin_module_buffers, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multi_dispatch_single_compile_param_inputs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multinomial, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multiple_devices_msg_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multiple_devices_msg_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_multiple_insert_removal_caching, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_only_once_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensor_warn_only_once_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_config_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_cudagraph_managed_tensors_config_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_on_inp_backend_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_on_inp_backend_inductor, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_mutation_reinplaced, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_no_rerecord_with_mark_static_address, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_not_fallback_to_eager_if_have_not_recompiling_too_many_times, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_output_alias, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_peristed_output_livenes, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_remove_hooks_on_cached_tensors, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_rerecord_if_static_input_address_changed, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_rng_non_trees, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_rng_trees, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_run_simple, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_separate_recordings, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_side_stream_memory_allocation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_single_stream_use, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_cpp_wrapper, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_cudagraph_unsafe_ops, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_if_dynamic_shape_limit_reached1, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_if_dynamic_shape_limit_reached2, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_skip_symbolic, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_sparsity, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_static_inputs_address_mutation_log, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_storage_access_error, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_tensor_constant_mutation, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_tensor_dies_between_checkpoint, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_tensor_no_longer_in_pool, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unaligned_static_input_no_cudagraphs, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unaligned_static_input_non_trees, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unaligned_static_input_trees, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unaligned_static_parameter, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_unstable_ptr, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_warmup_stream_sync, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_warn_on_pending_backward, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_warn_once_if_dynamic_shape_limit_reached, test/inductor/test_cudagraph_trees_expandable_segments.py::CudaGraphTreeTests::test_workspace_allocation_error 2025-08-14T22:18:12.0229953Z 2025-08-14T22:18:16.0886168Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:18:16.0887507Z import pkg_resources 2025-08-14T22:18:16.4720215Z Running test_functionalization_of_rng_ops 1/1 ... [2025-08-14 22:18:16.471410] 2025-08-14T22:18:16.4720778Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:18:16.4722232Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_functionalization_of_rng_ops.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:18:16.471844] 2025-08-14T22:18:21.2963701Z 2025-08-14T22:18:21.2964937Z test_functionalization_of_rng_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_functionalization_of_rng_ops_1.1_b47c46664637f645_.log 2025-08-14T22:18:21.2969866Z Running 10 items in this shard: test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_autograd_function_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_checkpoint_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_dropout_decomp_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_min_cut_partitioner_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_multiple_subgraphs_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_rand_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_rand_like_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_rand_like_dynamic_bwd_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_rand_like_dynamic_cuda_float32, test/test_functionalization_of_rng_ops.py::TestFunctionalizationRngOpsCUDA::test_set_get_rng_state_cuda_float32 2025-08-14T22:18:21.2974556Z 2025-08-14T22:18:25.4213055Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:18:25.4215422Z import pkg_resources 2025-08-14T22:18:25.8048205Z Running dynamo/test_hooks 1/1 ... [2025-08-14 22:18:25.804261] 2025-08-14T22:18:25.8048904Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:18:25.8051390Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_hooks.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:18:25.804735] 2025-08-14T22:18:30.6293219Z 2025-08-14T22:18:30.6294347Z dynamo/test_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_hooks_1.1_8d67aa3ed2424294_.log 2025-08-14T22:18:30.6305119Z Running 34 items in this shard: test/dynamo/test_hooks.py::HooksTests::test_complex_state_mutation_in_intermediary_hooks_same_on_inductor, test/dynamo/test_hooks.py::HooksTests::test_complex_state_mutation_in_intermediary_hooks_same_on_inductor_with_graph_break, test/dynamo/test_hooks.py::HooksTests::test_functools_arg_vary, test/dynamo/test_hooks.py::HooksTests::test_global_module_forward_pre_hook, test/dynamo/test_hooks.py::HooksTests::test_hook_with_closure, test/dynamo/test_hooks.py::HooksTests::test_hook_with_nested_closure, test/dynamo/test_hooks.py::HooksTests::test_input_hooks_same, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks_same_on_aot_eager, test/dynamo/test_hooks.py::HooksTests::test_intermediary_hooks_same_on_inductor, test/dynamo/test_hooks.py::HooksTests::test_intermediate_hook_with_closure_aot, test/dynamo/test_hooks.py::HooksTests::test_intermediate_hook_with_closure_eager, test/dynamo/test_hooks.py::HooksTests::test_nnmodule_hook_guards, test/dynamo/test_hooks.py::HooksTests::test_no_recompile_on_hook_identity_change, test/dynamo/test_hooks.py::HooksTests::test_no_recompile_on_same_hook, test/dynamo/test_hooks.py::HooksTests::test_post_acc_grad_hook, test/dynamo/test_hooks.py::HooksTests::test_recompile, test/dynamo/test_hooks.py::HooksTests::test_register_hook_partial_guarding, test/dynamo/test_hooks.py::HooksTests::test_removed_handle_return, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_only_register_hook_in_graph_local_inner, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_global_hook, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_global_hooks_handles_in_list, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_break_handle_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_break_handle_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_lambda, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_in_graph_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_multi_handle_return, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_repeated_handle_not_local, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_hook_repeated_handle_return, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_multiple_hooks, test/dynamo/test_hooks.py::HooksTests::test_tensor_register_multiple_hooks_handles_in_list, test/dynamo/test_hooks.py::HooksTests::test_wrap_top_frame_with_hooks 2025-08-14T22:18:30.6314549Z 2025-08-14T22:18:34.7266959Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:18:34.7268285Z import pkg_resources 2025-08-14T22:18:35.1057481Z Running dynamo/test_generator 1/1 ... [2025-08-14 22:18:35.105275] 2025-08-14T22:18:35.1058037Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:18:35.1060385Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_generator.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:18:35.105676] 2025-08-14T22:18:39.9807405Z 2025-08-14T22:18:39.9809018Z dynamo/test_generator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_generator_1.1_625062984680acce_.log 2025-08-14T22:18:39.9847459Z Running 75 items in this shard: test/dynamo/test_generator.py::GeneratorTests::test_cleanup_throw, test/dynamo/test_generator.py::GeneratorTests::test_deque_extendleft, test/dynamo/test_generator.py::GeneratorTests::test_dict_tuple_list_generator_container0, test/dynamo/test_generator.py::GeneratorTests::test_dict_tuple_list_generator_container1, test/dynamo/test_generator.py::GeneratorTests::test_dict_tuple_list_generator_container2, test/dynamo/test_generator.py::GeneratorTests::test_dict_tuple_list_generator_container3, test/dynamo/test_generator.py::GeneratorTests::test_dynamo_disable_generator, test/dynamo/test_generator.py::GeneratorTests::test_dynamo_disable_sub_generator, test/dynamo/test_generator.py::GeneratorTests::test_generator___contains__, test/dynamo/test_generator.py::GeneratorTests::test_generator___contains___side_effects, test/dynamo/test_generator.py::GeneratorTests::test_generator_as_argument, test/dynamo/test_generator.py::GeneratorTests::test_generator_as_argument_2, test/dynamo/test_generator.py::GeneratorTests::test_generator_as_argument_3, test/dynamo/test_generator.py::GeneratorTests::test_generator_as_argument_4, test/dynamo/test_generator.py::GeneratorTests::test_generator_simple, test/dynamo/test_generator.py::GeneratorTests::test_generator_with_side_effects, test/dynamo/test_generator.py::GeneratorTests::test_generator_with_side_effects_graph_break, test/dynamo/test_generator.py::GeneratorTests::test_generator_with_side_effects_graph_break_2, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_and_reconstruct_generator, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_before_calling_generator, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_in_generator, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_in_generator_2, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_in_generator_while_reconstructing, test/dynamo/test_generator.py::GeneratorTests::test_graph_break_outside_generator, test/dynamo/test_generator.py::GeneratorTests::test_infinite_generator, test/dynamo/test_generator.py::GeneratorTests::test_infinite_generator_2, test/dynamo/test_generator.py::GeneratorTests::test_infinite_generator_3, test/dynamo/test_generator.py::GeneratorTests::test_islice_chain, test/dynamo/test_generator.py::GeneratorTests::test_iter, test/dynamo/test_generator.py::GeneratorTests::test_list_extend, test/dynamo/test_generator.py::GeneratorTests::test_list_zip_generator, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_tensor_mutation, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_dict_mutation, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_dict_mutation_before, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_local_var_mutation, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_object_mutation, test/dynamo/test_generator.py::GeneratorTests::test_reconstruct_generator_with_object_mutation_before, test/dynamo/test_generator.py::GeneratorTests::test_return_advanced_generator, test/dynamo/test_generator.py::GeneratorTests::test_return_exhaust_generator, test/dynamo/test_generator.py::GeneratorTests::test_return_generator, test/dynamo/test_generator.py::GeneratorTests::test_return_subgenerator, test/dynamo/test_generator.py::GeneratorTests::test_return_tuple_generator, test/dynamo/test_generator.py::GeneratorTests::test_subgenerator, test/dynamo/test_generator.py::GeneratorTests::test_subgenerator_with_side_effects, test/dynamo/test_generator.py::GeneratorTests::test_zip_generator, test/dynamo/test_generator.py::GeneratorTests::test_zip_generator_2, test/dynamo/test_generator.py::GeneratorTests::test_zip_infinite_generator, test/dynamo/test_generator.py::GeneratorTests::test_zip_subgenerator, test/dynamo/test_generator.py::TestGeneratorSend::test_send, test/dynamo/test_generator.py::TestGeneratorSend::test_send_stop_iteration_fullgraph_False, test/dynamo/test_generator.py::TestGeneratorSend::test_send_stop_iteration_fullgraph_True, test/dynamo/test_generator.py::TestGeneratorClose::test_close, test/dynamo/test_generator.py::TestGeneratorClose::test_close_after_close, test/dynamo/test_generator.py::TestGeneratorClose::test_close_after_exception, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_GeneratorExit_fullgraph_False, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_GeneratorExit_fullgraph_True, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_GeneratorExit_return, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_and_reraise_GeneratorExit, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_and_reraise_exc_exc0, test/dynamo/test_generator.py::TestGeneratorClose::test_close_capture_and_reraise_exc_exc1, test/dynamo/test_generator.py::TestGeneratorClose::test_close_handling_finally, test/dynamo/test_generator.py::TestGeneratorClose::test_close_subgen, test/dynamo/test_generator.py::TestGeneratorClose::test_close_with_side_effects, test/dynamo/test_generator.py::TestGeneratorClose::test_close_with_subgen, test/dynamo/test_generator.py::TestGeneratorClose::test_next_after_close_fullgraph_False, test/dynamo/test_generator.py::TestGeneratorClose::test_next_after_close_fullgraph_True, test/dynamo/test_generator.py::TestGeneratorThrow::test_exception_context_with_yield, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_no_yield_after_throw, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_not_catch, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_raise_difference_exc, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_try_except_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_with_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_without_finally, test/dynamo/test_generator.py::TestGeneratorThrow::test_throw_yield_finally 2025-08-14T22:18:39.9881451Z 2025-08-14T22:18:44.1144434Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:18:44.1145773Z import pkg_resources 2025-08-14T22:18:44.5090644Z Running dynamo/test_reorder_logs 1/1 ... [2025-08-14 22:18:44.508561] 2025-08-14T22:18:44.5091084Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:18:44.5093316Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_reorder_logs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:18:44.508952] 2025-08-14T22:18:49.2834667Z 2025-08-14T22:18:49.2836054Z dynamo/test_reorder_logs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_reorder_logs_1.1_05c8f33663369f30_.log 2025-08-14T22:18:49.2841556Z Running 14 items in this shard: test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method0_fn0_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method1_fn1_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method2_fn2_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method3_fn3_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method4_fn4_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method5_fn5_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method6_fn6_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method7_fn7_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_constant_mutation, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_dont_reorder_print, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_custom_log_fn, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_print, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_print_graph_break, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_warnings 2025-08-14T22:18:49.2846966Z 2025-08-14T22:18:53.4290199Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:18:53.4291543Z import pkg_resources 2025-08-14T22:18:53.8097387Z Running dynamo/test_subclasses 1/1 ... [2025-08-14 22:18:53.809132] 2025-08-14T22:18:53.8097861Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:18:53.8100291Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_subclasses.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:18:53.809583] 2025-08-14T22:19:01.4393052Z 2025-08-14T22:19:01.4393811Z dynamo/test_subclasses 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_subclasses_1.1_94f61300d483e594_.log 2025-08-14T22:19:01.4435549Z Running 125 items in this shard: test/dynamo/test_subclasses.py::SubclassTests::test_as_subclass_attr_mutation, test/dynamo/test_subclasses.py::SubclassTests::test_base_torch_function_tracing, test/dynamo/test_subclasses.py::SubclassTests::test_compile_higher_order_with_functionalization, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_fake_tensor_automatic_dynamic, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_fake_tensor_dynamic_dim, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_functionalization, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function_restore_values, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function_restore_values_graph_break, test/dynamo/test_subclasses.py::SubclassTests::test_has_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_make_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_mark_static_with_subclass_desugaring_dynamic_False, test/dynamo/test_subclasses.py::SubclassTests::test_mark_static_with_subclass_desugaring_dynamic_True, test/dynamo/test_subclasses.py::SubclassTests::test_newly_constructed_tensor_subclass_attr_mutation, test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_from_buffer, test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_from_cat, test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_simple, test/dynamo/test_subclasses.py::SubclassTests::test_no_call_to_new, test/dynamo/test_subclasses.py::SubclassTests::test_no_torch_function_on_size_bytecode, test/dynamo/test_subclasses.py::SubclassTests::test_no_torch_function_recompiles, test/dynamo/test_subclasses.py::SubclassTests::test_nontraceable_tensor_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_overridden_method_guarding, test/dynamo/test_subclasses.py::SubclassTests::test_parameter_subclass_custom_torch_func_and_dynamic_attr, test/dynamo/test_subclasses.py::SubclassTests::test_parameter_subclass_with_old_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_recompile_with_symbool_inputs, test/dynamo/test_subclasses.py::SubclassTests::test_recompiles_with_optional_inner_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_return_as_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_return_local_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_return_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_TwoTensor_TwoTensor_TwoTensor, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_TwoTensor_nested_diff_sizes, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_constructor_proxying, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_dont_invoke_torch_function_on_overridden_attr, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_dont_invoke_torch_function_on_overridden_method, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_override_shape_and_to, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_views_dynamic_False, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_views_dynamic_True, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_with_disabled_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_support_bases, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_automatic_dynamic_shapes, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_clone_view, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_different_shape, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_mark_dynamic_shapes, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_mul, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_nested, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_multiple, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_shape, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_tensor_and_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_simple, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_view, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_view_mul, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_attr_codegen_tos, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_error_arg_num, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_error_not_classmethod, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_override, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_guards, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_recursive_guards, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_custom_attr, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_with_non_classmethod_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_torch_dispatch_subclass_guard_recompile, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_attr, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_method, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_method_arg, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_list_args, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_graph_break, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_guards, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_nested, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_tracing, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_subclass_survives_into_aot_autograd, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_wrapper_class, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_wrapper_class_with_kwargs, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_equality_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_equality_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_identity_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_identity_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_isinstance_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_isinstance_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_attr_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_method_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_property_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_dynamo_attribute_access_on_intermediate, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_guards_on_inner_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_with_differently_sized_inner_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_with_same_sized_inner_tensor, test/dynamo/test_subclasses.py::TestNestedTensor::test_basic_autograd, test/dynamo/test_subclasses.py::TestNestedTensor::test_basic_autograd_inductor, test/dynamo/test_subclasses.py::TestNestedTensor::test_binary_does_not_recompile, test/dynamo/test_subclasses.py::TestNestedTensor::test_binary_recompiles, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_4, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_5, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_6, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_3, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_4, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_5, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed_3, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_is_nested_call, test/dynamo/test_subclasses.py::TestNestedTensor::test_inference_tensor, test/dynamo/test_subclasses.py::TestNestedTensor::test_inline_nested_tensor_from_jagged, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_basic, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_False_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_False_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_True_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_True_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_obscure, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_basic, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_False_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_False_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_True_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_True_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_obscure, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_dense_subclass_dense_subclass, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_subclass_dense, test/dynamo/test_subclasses.py::TestNestedTensor::test_param_subclass_isinstance_input, test/dynamo/test_subclasses.py::TestNestedTensor::test_return_shape, test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_dense_subclass_dense_view, test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_gives_static_shapes_when_dynamic_false, test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_with_mutation_in_graph, test/dynamo/test_subclasses.py::TestNestedTensor::test_unary_does_not_recompile, test/dynamo/test_subclasses.py::TestNestedTensor::test_unbind 2025-08-14T22:19:01.4475685Z 2025-08-14T22:19:05.5620214Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:19:05.5621556Z import pkg_resources 2025-08-14T22:19:05.9397870Z Running export/test_sparse 1/1 ... [2025-08-14 22:19:05.939229] 2025-08-14T22:19:05.9398281Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:19:05.9399919Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_sparse.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:19:05.939608] 2025-08-14T22:19:10.8142948Z 2025-08-14T22:19:10.8143882Z export/test_sparse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_sparse_1.1_4c590277200c60e8_.log 2025-08-14T22:19:10.8206429Z Running 203 items in this shard: test/export/test_sparse.py::TestSparseProp::test_activation_coo, test/export/test_sparse.py::TestSparseProp::test_activation_csr, test/export/test_sparse.py::TestSparseProp::test_add, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCSR 2025-08-14T22:19:10.8264877Z 2025-08-14T22:19:14.9069612Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:19:14.9071523Z import pkg_resources 2025-08-14T22:19:15.2878997Z Running functorch/test_eager_transforms 1/1 ... [2025-08-14 22:19:15.287358] 2025-08-14T22:19:15.2879880Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:19:15.2882200Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_eager_transforms.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:19:15.287785] 2025-08-14T22:19:21.8650061Z 2025-08-14T22:19:21.8651248Z functorch/test_eager_transforms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_eager_transforms_1.1_9cee15fb557e4f13_.log 2025-08-14T22:19:21.8789937Z Running 355 items in this shard: test/functorch/test_eager_transforms.py::TestSliceArgnums::test_argnums_reorders, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_duplicate_argnums, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_negative_int_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_positive_int_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_tuple_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_invalid_argnum_type, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_not_enough_argnums, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_out_of_bounds_argnum_values, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_pytree_args, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_buffer_tying, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_combine_state_for_ensemble_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_combine_state_for_ensemble_smoke, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_correctness_mnist_mechanism_functional_call, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_correctness_mnist_mechanism_make_functional, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_disable_autograd_tracking_disable_autograd_tracking_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_disable_autograd_tracking_disable_autograd_tracking_True, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_make_functional_state_correctly_returned_after_forward_mechanism_functional_call, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_make_functional_state_correctly_returned_after_forward_mechanism_make_functional, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying_ensemble, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying_grad, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_leaf, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_mismatch_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_smoke, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_using_detach_functional_call_detach_params_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_using_detach_functional_call_detach_params_True, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_with_buffers_disable_autograd_tracking_disable_autograd_tracking_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_with_buffers_disable_autograd_tracking_disable_autograd_tracking_True, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_advanced_indexing_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composed_with_autograd_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_complicated_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_two_ops_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_conj_bit_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_dtype_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_escaped_wrappers_are_ignored_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_escaped_wrappers_are_marked_as_dead_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_fn_with_kwargs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_functional_init_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_functional_init_with_buffers_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_of_vjp_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_of_vjp_of_grad_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_pytree_inputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_captures_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_view_base_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_view_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_invalid_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_is_cuda_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_manual_seed_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_negative_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_nesting_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_inside_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_mixed_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_nested_complicated_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_nested_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_fn_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_only_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_value_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_numel_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_out_of_order_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_primitive_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_print_captured_tensor_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_shape_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_ctor_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_grad_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_vmap_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_hessian_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_vjp_multiple_inputs_outputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_view_inplace_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_views_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_of_grad_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_outputs_can_any_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_error_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_input_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_output_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_two_outputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_zero_grad_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_log_softmax_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_new_empty_materializes_tensor_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_new_zeros_materializes_tensor_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_embeddingnet_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_embeddingnet_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_inplace_view_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_simple_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_correctness_different_devices_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_correctness_different_devices_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_default_arg_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_default_arg_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_multi_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_multi_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_unrelated_outputs_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_unrelated_outputs_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_zero_dim_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_zero_dim_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_defaults_to_zero_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_defaults_to_zero_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_effect_on_return_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_effect_on_return_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_tuple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_tuple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_tensor_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_tensor_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_chunksize_one__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_chunksize_one__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_composition__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_composition__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_complex_error_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_diff_numel_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_diff_numel_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_dimensionality_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_dimensionality_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_float_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_float_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_hessian_simple_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_inplace_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_inplace_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_jac_with_non_tensor_args_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_jac_with_non_tensor_args_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_args_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_args_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_multidim_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_multidim_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_multiple_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_multiple_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_single_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_single_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_negative_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_negative_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_nested_jac_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_nested_jac_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_out_of_bounds_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_out_of_bounds_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_outputs_can_any_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_outputs_can_any_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_repeated_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_repeated_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_not_flat_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_not_flat_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_take_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_take_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_input_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_input_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_vmap_on_jac_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_vmap_on_jac_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_autograd_function_disables_fwd_grad_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_inside_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_mixed_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_outside_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_inplace_on_captures_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_inputs_are_tuples_of_tensors_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_jvp_inside_autograd_function_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_jvp_new_tensor_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_inputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_inputs_outputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_outputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_nonempty_primals_and_tangents_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_outputs_can_any_pytree_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_primals_tangents_length_mismatch_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_pytree_inputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_pytree_inputs_error_cases_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_simple_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_strict_mode_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_unrelated_input_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_unrelated_output_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_zerotensor_vmapjvp_interaction_cuda, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_basic_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_composition_grad_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_composition_vmap_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_errors_cuda, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_nested_input_nested_output_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_return_cuda_float32, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_base_inplace_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_base_view_inplace_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_no_view_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_right_dual_base_prop_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_right_dual_view_prop_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_multi_input_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_simple_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_unrelated_outputs_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_jacfwd_different_levels_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_functionalize_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jacfwd_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jacrev_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jvp_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_vjp_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_functionalize_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_grad_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_vmap_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_functionalize_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_jvp_supports_saved_tensor_hooks_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_no_warning_on_import_functorch_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_requires_grad_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_retain_grad_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_doesnt_support_saved_tensor_hooks_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_vmap_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_ensemble_regression_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_ensemble_regression_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_AlphaDropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_AlphaDropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_Dropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_Dropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_FeatureAlphaDropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_FeatureAlphaDropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_lennard_jones_batched_jac_jac_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_lennard_jones_batched_jac_jac_jacrev_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_omniglot_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_omniglot_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_regression_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_regression_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_resnet18_per_sample_grads_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_resnet18_per_sample_grads_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_functional_call_originally_track_running_stats_False_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_functional_call_originally_track_running_stats_True_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_make_functional_originally_track_running_stats_False_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_make_functional_originally_track_running_stats_True_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_basic_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_functional_call_multiple_dicts_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_name_wrapping_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_no_grad_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_no_grad_outside_grad_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_vmap_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_vmap_sum_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fake_tensors_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_multi_out_op_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_out_op_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_reapply_views_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_transpose_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_grad_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_nonfunctional_output_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_opt_tensor_list_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_optional_tensorlist1_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_optional_tensorlist2_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_inplace_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_linear_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_multioutput_inplace_slice_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_multioutput_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_resize_program_inputs_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_simple_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_vmap_functionalize_jvp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_grad_fn_name_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_needs_input_grads_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_once_differentiable_autograd_vjp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_once_differentiable_grad_vjp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_set_materialize_grads_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_has_vmap_staticmethod_and_has_generate_vmap_rule_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_in_dims_multiple_inputs_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_in_dims_single_input_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_incompatible_out_dims_error_msg_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_info_object_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_kwarg_only_tensors_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_no_vmap_staticmethod_and_no_generate_vmap_rule_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_none_returns_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_should_have_two_returns_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_skips_empty_layer_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_error_if_name_collision_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_nesting_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_overrides_saved_tensors_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_passthrough_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_debug_unwrap_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_reductify_leaf_cuda, test/functorch/test_eager_transforms.py::TestCompileTransformsCUDA::test_compile_vmap_hessian_cuda, test/functorch/test_eager_transforms.py::TestCompileTransformsCUDA::test_grad_deprecated_api_cuda 2025-08-14T22:19:21.8923195Z 2025-08-14T22:19:25.9751537Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:19:25.9753829Z import pkg_resources 2025-08-14T22:19:26.3535617Z Running export/test_draft_export 1/1 ... [2025-08-14 22:19:26.352996] 2025-08-14T22:19:26.3536350Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:19:26.3538149Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_draft_export.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:19:26.353401] 2025-08-14T22:19:31.1272364Z 2025-08-14T22:19:31.1273404Z export/test_draft_export 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_draft_export_1.1_aa5672157c894c89_.log 2025-08-14T22:19:31.1280755Z Running 21 items in this shard: test/export/test_draft_export.py::TestDraftExport::test_complex_data_dependent_expr, test/export/test_draft_export.py::TestDraftExport::test_constantify_unbacked_symbol, test/export/test_draft_export.py::TestDraftExport::test_cuda_memory_usage, test/export/test_draft_export.py::TestDraftExport::test_data_dependent_failure, test/export/test_draft_export.py::TestDraftExport::test_dedup_data_dependent_failure, test/export/test_draft_export.py::TestDraftExport::test_fake_infer_dense_in_memory_check, test/export/test_draft_export.py::TestDraftExport::test_masked_linear, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_custom_op_basic, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_custom_op_multiple_profiles, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_custom_op_update_profile, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_guard, test/export/test_draft_export.py::TestDraftExport::test_missing_meta_kernel_impl, test/export/test_draft_export.py::TestDraftExport::test_offsets, test/export/test_draft_export.py::TestDraftExport::test_override_incorrectly_aliasing_kernel, test/export/test_draft_export.py::TestDraftExport::test_override_mismatched_fake_kernel_with_unbacked_symbols, test/export/test_draft_export.py::TestDraftExport::test_override_size_and_dtype_mismatched_fake_kernels, test/export/test_draft_export.py::TestDraftExport::test_shape_failure, test/export/test_draft_export.py::TestDraftExport::test_side_effect1, test/export/test_draft_export.py::TestDraftExport::test_side_effect_inps, test/export/test_draft_export.py::TestDraftExport::test_torchbind, test/export/test_draft_export.py::TestDraftExport::test_unbacked_div_mod_replacement 2025-08-14T22:19:31.1287665Z 2025-08-14T22:19:35.2761308Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:19:35.2762642Z import pkg_resources 2025-08-14T22:19:35.6651104Z Running dynamo/test_comptime 1/1 ... [2025-08-14 22:19:35.664546] 2025-08-14T22:19:35.6651795Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:19:35.6653329Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_comptime.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:19:35.664947] 2025-08-14T22:19:40.3905306Z 2025-08-14T22:19:40.3906049Z dynamo/test_comptime 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_comptime_1.1_2382325d9cd30383_.log 2025-08-14T22:19:40.3909735Z Running 12 items in this shard: test/dynamo/test_comptime.py::ComptimeTests::test_get_local, test/dynamo/test_comptime.py::ComptimeTests::test_get_local_closure_variable, test/dynamo/test_comptime.py::ComptimeTests::test_graph_break, test/dynamo/test_comptime.py::ComptimeTests::test_print_bt, test/dynamo/test_comptime.py::ComptimeTests::test_print_direct, test/dynamo/test_comptime.py::ComptimeTests::test_print_disas, test/dynamo/test_comptime.py::ComptimeTests::test_print_graph, test/dynamo/test_comptime.py::ComptimeTests::test_print_guards, test/dynamo/test_comptime.py::ComptimeTests::test_print_locals, test/dynamo/test_comptime.py::ComptimeTests::test_print_single, test/dynamo/test_comptime.py::ComptimeTests::test_print_value_stack, test/dynamo/test_comptime.py::ComptimeTests::test_sleep 2025-08-14T22:19:40.3912561Z 2025-08-14T22:19:44.5127156Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:19:44.5128492Z import pkg_resources 2025-08-14T22:19:44.9026731Z Running dynamo/test_python_autograd 1/1 ... [2025-08-14 22:19:44.902185] 2025-08-14T22:19:44.9027182Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:19:44.9031270Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_python_autograd.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:19:44.902748] 2025-08-14T22:19:49.6278353Z 2025-08-14T22:19:49.6279262Z dynamo/test_python_autograd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_python_autograd_1.1_722ac34e8dc2525e_.log 2025-08-14T22:19:49.6281312Z Running 5 items in this shard: test/dynamo/test_python_autograd.py::TestPythonAutograd::test_backwards1, test/dynamo/test_python_autograd.py::TestPythonAutograd::test_backwards2, test/dynamo/test_python_autograd.py::TestPythonAutograd::test_forwards1, test/dynamo/test_python_autograd.py::TestPythonAutograd::test_forwards2, test/dynamo/test_python_autograd.py::TestPythonAutograd::test_split 2025-08-14T22:19:49.6283116Z 2025-08-14T22:19:53.7028387Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:19:53.7030657Z import pkg_resources 2025-08-14T22:19:54.0864240Z Running test_quantization 1/4 ... [2025-08-14 22:19:54.085947] 2025-08-14T22:19:54.0864647Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:19:54.0867160Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_quantization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:19:54.086377] 2025-08-14T22:24:19.0932931Z 2025-08-14T22:24:19.0934106Z inductor/test_torchinductor_opinfo 6/11 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_6.11_ea5dbff01e7c09b9_.log 2025-08-14T22:24:19.1073458Z Running 321 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__batch_norm_with_update_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_arange_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asinh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bincount_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_and_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bmm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_char_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clone_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_deg2rad_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_embed_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfinv_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftshift_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fliplr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flipud_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gcd_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hash_tensor_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_histc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_i0_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kthvalue_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_svdvals_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vander_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vecdot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vector_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_tensor_overload_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logdet_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_not_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_or_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_maximum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nextafter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nextafter_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_avg_pool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu6_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_relu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_selu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softplus_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rand_like_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_like_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_roll_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_add_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_mm_reduce_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_erfcx_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_ndtr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensordot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triangular_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_indices_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trunc_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_complex_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_float64 2025-08-14T22:24:19.1206906Z 2025-08-14T22:24:23.2634428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:24:23.2636193Z import pkg_resources 2025-08-14T22:24:23.6580505Z Running test_dynamic_shapes 1/1 ... [2025-08-14 22:24:23.657527] 2025-08-14T22:24:23.6580930Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:24:23.6582630Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_dynamic_shapes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:24:23.657963] 2025-08-14T22:24:29.0341325Z 2025-08-14T22:24:29.0342195Z test_dynamic_shapes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_dynamic_shapes_1.1_84d10da10575cb0c_.log 2025-08-14T22:24:29.0460730Z Running 331 items in this shard: test/test_dynamic_shapes.py::TestPySymInt::test_arith_ops, test/test_dynamic_shapes.py::TestPySymInt::test_aten_ops, test/test_dynamic_shapes.py::TestPySymInt::test_avoid_unbacked_substitution, test/test_dynamic_shapes.py::TestPySymInt::test_backed_size_oblivious_01_spec, test/test_dynamic_shapes.py::TestPySymInt::test_baddbmm_symint, test/test_dynamic_shapes.py::TestPySymInt::test_binary, test/test_dynamic_shapes.py::TestPySymInt::test_data_dependent_guard, test/test_dynamic_shapes.py::TestPySymInt::test_data_dependent_guard_propagate_real_tensors, test/test_dynamic_shapes.py::TestPySymInt::test_debug_has_internal_overlap_unbacked, test/test_dynamic_shapes.py::TestPySymInt::test_deepcopy, test/test_dynamic_shapes.py::TestPySymInt::test_duck_shape, test/test_dynamic_shapes.py::TestPySymInt::test_ephemeral_source_simplification, test/test_dynamic_shapes.py::TestPySymInt::test_ephemeral_source_unified_with_non_ephemeral_source, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_basic, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_double_digits, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_prefer_later, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_refine_range, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_with_s0, test/test_dynamic_shapes.py::TestPySymInt::test_floor_clean_div_axioms, test/test_dynamic_shapes.py::TestPySymInt::test_floordiv_static, test/test_dynamic_shapes.py::TestPySymInt::test_fx_trace_intlist, test/test_dynamic_shapes.py::TestPySymInt::test_guard_int, test/test_dynamic_shapes.py::TestPySymInt::test_guard_refine_range, test/test_dynamic_shapes.py::TestPySymInt::test_int_bool, test/test_dynamic_shapes.py::TestPySymInt::test_int_conversion, test/test_dynamic_shapes.py::TestPySymInt::test_int_to_float, test/test_dynamic_shapes.py::TestPySymInt::test_max_of_unique_summation_opt, test/test_dynamic_shapes.py::TestPySymInt::test_meta_symint, test/test_dynamic_shapes.py::TestPySymInt::test_mul_int_oo_nan, test/test_dynamic_shapes.py::TestPySymInt::test_non_overlapping_and_dense, test/test_dynamic_shapes.py::TestPySymInt::test_non_overlapping_and_dense_unbacked, test/test_dynamic_shapes.py::TestPySymInt::test_numel, test/test_dynamic_shapes.py::TestPySymInt::test_numpy_sym_max, test/test_dynamic_shapes.py::TestPySymInt::test_numpy_sym_min, test/test_dynamic_shapes.py::TestPySymInt::test_prefer_deferred_runtime_assertions_over_guards, test/test_dynamic_shapes.py::TestPySymInt::test_print_readable_with_symints, test/test_dynamic_shapes.py::TestPySymInt::test_reverse_arith_ops, test/test_dynamic_shapes.py::TestPySymInt::test_roundtrip, test/test_dynamic_shapes.py::TestPySymInt::test_size_expressions, test/test_dynamic_shapes.py::TestPySymInt::test_specialize_zero_one, test/test_dynamic_shapes.py::TestPySymInt::test_statically_known_false, test/test_dynamic_shapes.py::TestPySymInt::test_statically_known_true, test/test_dynamic_shapes.py::TestPySymInt::test_stride, test/test_dynamic_shapes.py::TestPySymInt::test_sym_ceil, test/test_dynamic_shapes.py::TestPySymInt::test_sym_floor, test/test_dynamic_shapes.py::TestPySymInt::test_sym_int, test/test_dynamic_shapes.py::TestPySymInt::test_sym_ite, test/test_dynamic_shapes.py::TestPySymInt::test_sym_log2, test/test_dynamic_shapes.py::TestPySymInt::test_sym_max_multi_max_simplify, test/test_dynamic_shapes.py::TestPySymInt::test_sym_sqrt, test/test_dynamic_shapes.py::TestPySymInt::test_sym_sum, test/test_dynamic_shapes.py::TestPySymInt::test_sym_trunc, test/test_dynamic_shapes.py::TestPySymInt::test_symint_args, test/test_dynamic_shapes.py::TestPySymInt::test_symint_as_scalar, test/test_dynamic_shapes.py::TestPySymInt::test_symint_bitwise_and, test/test_dynamic_shapes.py::TestPySymInt::test_symint_bitwise_or, test/test_dynamic_shapes.py::TestPySymInt::test_symint_vargs, test/test_dynamic_shapes.py::TestPySymInt::test_sympify_symint, test/test_dynamic_shapes.py::TestPySymInt::test_sympy_optimized_add, test/test_dynamic_shapes.py::TestPySymInt::test_sympy_optimized_add_binary_search, test/test_dynamic_shapes.py::TestPySymInt::test_tensor_factory_with_symint, test/test_dynamic_shapes.py::TestPySymInt::test_tracing_sym_ite, test/test_dynamic_shapes.py::TestPySymInt::test_unbacked_substitution, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_abs, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_add, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_and, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_bitwise_and, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_bitwise_or, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_ceil, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_eq, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_float_pow, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_float_truediv, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_floor, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_ge, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_gt, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_int_floordiv, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_int_truediv, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_is_integer, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_le, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_lshift, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_lt, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_mod, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_mul, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_ne, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_neg, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_or, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_pos, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_pow_by_natural, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_round, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_rshift, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sub, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_acos, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_asin, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_atan, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_cos, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_cosh, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_ite, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_log2, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_max, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_min, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_not, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_sin, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_sinh, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_sqrt, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_tan, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_tanh, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_trunc, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_abs_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_abs_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_abs_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_abs_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_add_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_add_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_add_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_add_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_and_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_and_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_and_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_and_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_and_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_and_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_and_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_and_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_or_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_or_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_or_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_or_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ceil_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ceil_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ceil_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ceil_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_eq_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_eq_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_eq_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_eq_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_pow_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_pow_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_pow_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_pow_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_truediv_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_truediv_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_truediv_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_truediv_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_floor_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_floor_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_floor_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_floor_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ge_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ge_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ge_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ge_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_gt_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_gt_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_gt_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_gt_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_floordiv_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_floordiv_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_floordiv_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_floordiv_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_truediv_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_truediv_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_truediv_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_truediv_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_is_integer_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_is_integer_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_is_integer_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_is_integer_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_le_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_le_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_le_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_le_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lshift_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lshift_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lshift_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lshift_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lt_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lt_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lt_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lt_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mod_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mod_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mod_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mod_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mul_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mul_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mul_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mul_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ne_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ne_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ne_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ne_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_neg_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_neg_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_neg_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_neg_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_or_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_or_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_or_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_or_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pos_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pos_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pos_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pos_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pow_by_natural_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pow_by_natural_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pow_by_natural_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pow_by_natural_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_round_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_round_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_round_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_round_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_rshift_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_rshift_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_rshift_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_rshift_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sub_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sub_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sub_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sub_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_acos_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_acos_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_acos_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_acos_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_asin_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_asin_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_asin_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_asin_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_atan_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_atan_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_atan_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_atan_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cos_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cos_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cos_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cos_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cosh_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cosh_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cosh_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cosh_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_float_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_float_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_float_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_float_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_ite_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_ite_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_ite_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_ite_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_log2_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_log2_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_log2_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_log2_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_max_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_max_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_max_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_max_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_min_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_min_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_min_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_min_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_not_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_not_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_not_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_not_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sin_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sin_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sin_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sin_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sinh_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sinh_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sinh_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sinh_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sqrt_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sqrt_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sqrt_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sqrt_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tan_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tan_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tan_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tan_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tanh_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tanh_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tanh_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tanh_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_trunc_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_trunc_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_trunc_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_trunc_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_non_symbolic_symnode, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_stride_symnode, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_symint_deepcopy, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_symint_hashing, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_symnode_hashing, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_assumptions, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_div_by_one, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_float_int, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_simplify, test/test_dynamic_shapes.py::TestDimConstraints::test_dim_constraints_reduce_congruences_simple, test/test_dynamic_shapes.py::TestDimConstraints::test_dim_constraints_reduce_inequalities_error, test/test_dynamic_shapes.py::TestDimConstraints::test_dim_constraints_reduce_inequalities_simple, test/test_dynamic_shapes.py::TestDimConstraints::test_dim_constraints_solve_full, test/test_dynamic_shapes.py::TestDimConstraints::test_simplify_max_1_0, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guard_or_false, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guard_or_true, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guards_float_div, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guards_float_print, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guards_gt_lt, test/test_dynamic_shapes.py::TestGuardsExpressions::test_remove_symbols_without_guarding, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_neq_assert_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_neq_assert_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_sym_eq_assert_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_sym_eq_assert_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_sym_or_assert_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_sym_or_assert_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_with_unbacked_input_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_with_unbacked_input_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_has_free_symbols, test/test_dynamic_shapes.py::TestUnbacked::test_post_specialize_runtime_assert1_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_post_specialize_runtime_assert1_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_post_specialize_runtime_assert2_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_post_specialize_runtime_assert2_backend_inductor, test/test_dynamic_shapes.py::TestUbackedOps::test_invalid_view_unbacked_view, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_contiguous, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_non_contigious_reshape_failing, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_reshape1, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_reshape2, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select2, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select_index, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select_index_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select_index_with_check, test/test_dynamic_shapes.py::TestUbackedOps::test_unbind_not_dynamic 2025-08-14T22:24:29.0567381Z 2025-08-14T22:24:33.1675699Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:24:33.1678065Z import pkg_resources 2025-08-14T22:24:33.5513316Z Running higher_order_ops/test_invoke_subgraph 1/1 ... [2025-08-14 22:24:33.550727] 2025-08-14T22:24:33.5513813Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:24:33.5515596Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'higher_order_ops/test_invoke_subgraph.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:24:33.551193] 2025-08-14T22:24:41.0302498Z 2025-08-14T22:24:41.0303542Z higher_order_ops/test_invoke_subgraph 1/1 was successful, full logs can be found in artifacts with path test/test-reports/higher_order_ops.test_invoke_subgraph_1.1_61913ed1c9421b98_.log 2025-08-14T22:24:41.0327858Z Running 66 items in this shard: test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraph::test_aot_function, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraph::test_multiple, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraph::test_simple, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_ac, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_ac_rng, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_auto_functionalize, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_autograd_function, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_buffer_mutation_errors_under_training, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_buffer_mutation_works_under_no_grad, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_bwd_partitioning, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_complex, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_const_tensor, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_dce, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_dedupe, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_different_strides_in_backward, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_different_symint, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_differing_strides_for_grad_outs, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_div, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_dropout_checks_joint_graph, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_dropout_checks_joint_graph_inference, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_dynamic, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_fail_with_direct_invoke_subgraph, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_fake_tensor_checking, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_gen_schema, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_gen_schema_with_buffer_mutation, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_input_input_aliasing, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_input_mutation, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_input_mutation_inference_mode, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_input_mutation_mutiple_times, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_input_mutation_mutiple_times_fake_tensor_cahche_hit, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_input_output_aliasing, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_kwargs_only, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_list, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_mod_attr_aliasing, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_module, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_module_forward, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_module_method, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_nonlocal_update, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_normalize_gm, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_output_output_aliasing, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_pending_unbacked, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_preserves_output_strides, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_preserves_strides, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_redundant_compile_region, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_return_none, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_return_none_from_fwd, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_sdpa, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_simple, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_simple_module, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_symint_from_fwd_to_bwd, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_triton_kernel_native, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_tuple_of_tuple, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_unbacked, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_unbacked_symbol, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_view_to_reshape, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportNonstrict::test_multiple_module, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportNonstrict::test_pending_unbacked, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportNonstrict::test_simple_func, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportNonstrict::test_simple_method, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportNonstrict::test_unbacked, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportStrict::test_multiple_module, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportStrict::test_pending_unbacked, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportStrict::test_simple_func, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportStrict::test_simple_method, test/higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphExportStrict::test_unbacked, test/higher_order_ops/test_invoke_subgraph.py::NegativeTesting::test_graph_break 2025-08-14T22:24:41.0351497Z 2025-08-14T22:24:45.0995747Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:24:45.0997098Z import pkg_resources 2025-08-14T22:24:45.4857075Z Running test_package 1/1 ... [2025-08-14 22:24:45.485152] 2025-08-14T22:24:45.4857732Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:24:45.4858722Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_package.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:24:45.485549] 2025-08-14T22:24:50.4595678Z 2025-08-14T22:24:50.4596562Z test_package 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_package_1.1_99aedd6e8acbed92_.log 2025-08-14T22:24:50.4627341Z Running 137 items in this shard: test/test_package.py::TestAnalyze::test_trace_dependencies, test/test_package.py::TestDependencyAPI::test_allow_empty_with_error, test/test_package.py::TestDependencyAPI::test_broken_dependency, test/test_package.py::TestDependencyAPI::test_deny, test/test_package.py::TestDependencyAPI::test_deny_glob, test/test_package.py::TestDependencyAPI::test_extern, test/test_package.py::TestDependencyAPI::test_extern_glob, test/test_package.py::TestDependencyAPI::test_extern_glob_allow_empty, test/test_package.py::TestDependencyAPI::test_externing_c_extension, test/test_package.py::TestDependencyAPI::test_implicit_intern, test/test_package.py::TestDependencyAPI::test_intern_error, test/test_package.py::TestDependencyAPI::test_invalid_import, test/test_package.py::TestDependencyAPI::test_mock, test/test_package.py::TestDependencyAPI::test_mock_glob, test/test_package.py::TestDependencyAPI::test_mock_glob_allow_empty, test/test_package.py::TestDependencyAPI::test_pickle_mocked, test/test_package.py::TestDependencyAPI::test_pickle_mocked_all, test/test_package.py::TestDependencyAPI::test_repackage_mocked_module, test/test_package.py::TestDependencyHooks::test_extern_and_mock_hook, test/test_package.py::TestDependencyHooks::test_multiple_extern_hooks, test/test_package.py::TestDependencyHooks::test_multiple_mock_hooks, test/test_package.py::TestDependencyHooks::test_remove_hooks, test/test_package.py::TestDependencyHooks::test_single_hook, test/test_package.py::TestDiGraph::test_all_paths, test/test_package.py::TestDiGraph::test_contains, test/test_package.py::TestDiGraph::test_contains_non_hashable, test/test_package.py::TestDiGraph::test_edges, test/test_package.py::TestDiGraph::test_forward_closure, test/test_package.py::TestDiGraph::test_iter, test/test_package.py::TestDiGraph::test_node_attr_update, test/test_package.py::TestDiGraph::test_node_attrs, test/test_package.py::TestDiGraph::test_predecessor_not_in_graph, test/test_package.py::TestDiGraph::test_predecessors, test/test_package.py::TestDiGraph::test_successor_not_in_graph, test/test_package.py::TestDiGraph::test_successors, test/test_package.py::DirectoryReaderTest::test_importer_access, test/test_package.py::DirectoryReaderTest::test_loading_has_record, test/test_package.py::DirectoryReaderTest::test_loading_module, test/test_package.py::DirectoryReaderTest::test_loading_pickle, test/test_package.py::DirectoryReaderTest::test_package_resource_access, test/test_package.py::DirectoryReaderTest::test_resource_access_by_path, test/test_package.py::DirectoryReaderTest::test_resource_reader, test/test_package.py::DirectoryReaderTest::test_scriptobject_failure_message, test/test_package.py::TestGlobGroup::test_exclude, test/test_package.py::TestGlobGroup::test_exclude_from_all, test/test_package.py::TestGlobGroup::test_invalid_raw, test/test_package.py::TestGlobGroup::test_list_include_exclude, test/test_package.py::TestGlobGroup::test_one_star, test/test_package.py::TestGlobGroup::test_one_star_middle, test/test_package.py::TestGlobGroup::test_one_star_multiple_in_component, test/test_package.py::TestGlobGroup::test_one_star_partial, test/test_package.py::TestGlobGroup::test_one_star_partial_extension, test/test_package.py::TestGlobGroup::test_raw_two_star, test/test_package.py::TestGlobGroup::test_two_star, test/test_package.py::TestGlobGroup::test_two_star_end, test/test_package.py::TestGlobGroup::test_two_star_middle, test/test_package.py::TestGlobGroup::test_two_star_multiple, test/test_package.py::TestImporter::test_ordered_importer_basic, test/test_package.py::TestImporter::test_ordered_importer_whichmodule, test/test_package.py::TestImporter::test_package_importer_whichmodule_no_dunder_module, test/test_package.py::TestImporter::test_single_ordered_importer, test/test_package.py::TestImporter::test_sys_importer, test/test_package.py::TestImporter::test_sys_importer_roundtrip, test/test_package.py::TestLoadBCPackages::test_load_bc_packages_fx_module, test/test_package.py::TestLoadBCPackages::test_load_bc_packages_nn_module, test/test_package.py::TestLoadBCPackages::test_load_bc_packages_torchscript_module, test/test_package.py::TestMangling::test_demangle_base, test/test_package.py::TestMangling::test_demangler_multiple_manglers, test/test_package.py::TestMangling::test_is_mangled, test/test_package.py::TestMangling::test_mangle_empty_errors, test/test_package.py::TestMangling::test_mangle_prefix, test/test_package.py::TestMangling::test_mangler_is_consistent, test/test_package.py::TestMangling::test_package_mangler, test/test_package.py::TestMangling::test_roundtrip_mangling, test/test_package.py::TestMangling::test_unique_manglers, test/test_package.py::TestMangling::test_unique_module_names, test/test_package.py::TestMisc::test_dunder_package_present, test/test_package.py::TestMisc::test_dunder_package_works_from_package, test/test_package.py::TestMisc::test_exporter_content_lists, test/test_package.py::TestMisc::test_file_structure, test/test_package.py::TestMisc::test_file_structure_has_file, test/test_package.py::TestMisc::test_inspect_class, test/test_package.py::TestMisc::test_is_from_package, test/test_package.py::TestMisc::test_load_python_version_from_package, test/test_package.py::TestMisc::test_loaders_that_remap_files_work_ok, test/test_package.py::TestMisc::test_python_version, test/test_package.py::TestMisc::test_std_lib_sys_hackery_checks, test/test_package.py::ModelTest::test_model_save, test/test_package.py::ModelTest::test_resnet, test/test_package.py::ModelTest::test_script_resnet, test/test_package.py::TestPackageFX::test_package_fx_custom_tracer, test/test_package.py::TestPackageFX::test_package_fx_package, test/test_package.py::TestPackageFX::test_package_fx_simple, test/test_package.py::TestPackageFX::test_package_fx_with_imports, test/test_package.py::TestPackageFX::test_package_fx_wrap, test/test_package.py::TestPackageFX::test_package_gm_preserve_stack_trace, test/test_package.py::TestPackageFX::test_package_then_fx, test/test_package.py::TestPackageScript::test_different_package_interface, test/test_package.py::TestPackageScript::test_different_package_script_class, test/test_package.py::TestPackageScript::test_load_shared_scriptmodules, test/test_package.py::TestPackageScript::test_load_shared_tensors, test/test_package.py::TestPackageScript::test_load_shared_tensors_repackaged, test/test_package.py::TestPackageScript::test_mixing_packaged_and_inline_modules, test/test_package.py::TestPackageScript::test_mixing_packaged_and_inline_modules_shared_code, test/test_package.py::TestPackageScript::test_package_interface, test/test_package.py::TestPackageScript::test_package_script_class, test/test_package.py::TestPackageScript::test_package_script_class_referencing_self, test/test_package.py::TestPackageScript::test_save_eager_mods_sharing_scriptmodule, test/test_package.py::TestPackageScript::test_save_independent_scriptmodules, test/test_package.py::TestPackageScript::test_save_repeat_scriptmodules, test/test_package.py::TestPackageScript::test_save_scriptmodule, test/test_package.py::TestPackageScript::test_save_scriptmodule_file, test/test_package.py::TestPackageScript::test_save_scriptmodule_only_necessary_code, test/test_package.py::TestPackageScript::test_save_scriptmodule_with_submods, test/test_package.py::TestPackageScript::test_save_scriptmodules_in_container, test/test_package.py::TestPackageScript::test_save_scriptmodules_submod_redefinition, test/test_package.py::TestPackageScript::test_save_shared_tensors, test/test_package.py::TestPackageScript::test_saving_and_scripting_packaged_mod, test/test_package.py::TestPackageScript::test_scriptmodules_repeat_save, test/test_package.py::TestPackageScript::test_tensor_sharing_pickle, test/test_package.py::TestRepackage::test_repackage_import_indirectly_via_parent_module, test/test_package.py::TestResources::test_importer_access, test/test_package.py::TestResources::test_package_resource_access, test/test_package.py::TestResources::test_resource_access_by_path, test/test_package.py::TestResources::test_resource_reader, test/test_package.py::TestSaveLoad::test_bad_dunder_imports, test/test_package.py::TestSaveLoad::test_dunder_imports, test/test_package.py::TestSaveLoad::test_exporting_mismatched_code, test/test_package.py::TestSaveLoad::test_pickle, test/test_package.py::TestSaveLoad::test_pickle_long_name_with_protocol_4, test/test_package.py::TestSaveLoad::test_save_imported_module, test/test_package.py::TestSaveLoad::test_save_imported_module_using_package_importer, test/test_package.py::TestSaveLoad::test_save_load_fp8, test/test_package.py::TestSaveLoad::test_save_module, test/test_package.py::TestSaveLoad::test_save_module_binary, test/test_package.py::TestSaveLoad::test_saving_source, test/test_package.py::TestSaveLoad::test_saving_string 2025-08-14T22:24:50.4658071Z 2025-08-14T22:24:54.6321909Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:24:54.6323528Z import pkg_resources 2025-08-14T22:24:55.0122353Z Running functorch/test_rearrange 1/1 ... [2025-08-14 22:24:55.011777] 2025-08-14T22:24:55.0122955Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:24:55.0125310Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_rearrange.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:24:55.012183] 2025-08-14T22:24:59.5353802Z 2025-08-14T22:24:59.5354703Z functorch/test_rearrange 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_rearrange_1.1_68d46f8f3fe8a6c3_.log 2025-08-14T22:24:59.5358111Z Running 10 items in this shard: test/functorch/test_rearrange.py::TestRearrange::test_0_dim_tensor, test/functorch/test_rearrange.py::TestRearrange::test_collapsed_ellipsis_errors_out, test/functorch/test_rearrange.py::TestRearrange::test_concatenations_and_stacking, test/functorch/test_rearrange.py::TestRearrange::test_dimension_mismatch_no_ellipsis, test/functorch/test_rearrange.py::TestRearrange::test_dimension_mismatch_with_ellipsis, test/functorch/test_rearrange.py::TestRearrange::test_ellipsis_ops, test/functorch/test_rearrange.py::TestRearrange::test_rearrange_consistency, test/functorch/test_rearrange.py::TestRearrange::test_rearrange_permutations, test/functorch/test_rearrange.py::TestRearrange::test_squeeze, test/functorch/test_rearrange.py::TestRearrange::test_unsqueeze 2025-08-14T22:24:59.5361210Z 2025-08-14T22:25:03.6542339Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:25:03.6543659Z import pkg_resources 2025-08-14T22:25:04.0461834Z Running functorch/test_parsing 1/1 ... [2025-08-14 22:25:04.045670] 2025-08-14T22:25:04.0462283Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:25:04.0465299Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_parsing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:25:04.046143] 2025-08-14T22:25:08.6202454Z 2025-08-14T22:25:08.6203245Z functorch/test_parsing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_parsing_1.1_00b87e2aad5f0892_.log 2025-08-14T22:25:08.6208230Z Running 12 items in this shard: test/functorch/test_parsing.py::TestAnonymousAxis::test_anonymous_axes, test/functorch/test_parsing.py::TestParsedExpression::test_elementary_axis_name, test/functorch/test_parsing.py::TestParsedExpression::test_invalid_expressions, test/functorch/test_parsing.py::TestParsedExpression::test_parse_expression, test/functorch/test_parsing.py::TestParsingUtils::test_ellipsis_invalid_identifier, test/functorch/test_parsing.py::TestParsingUtils::test_ellipsis_matching, test/functorch/test_parsing.py::TestParsingUtils::test_left_parenthesized_ellipsis, test/functorch/test_parsing.py::TestParsingUtils::test_parse_pattern_number_of_arrows, test/functorch/test_parsing.py::TestValidateRearrangeExpressions::test_identifier_mismatch, test/functorch/test_parsing.py::TestValidateRearrangeExpressions::test_non_unitary_anonymous_axes_raises_error, test/functorch/test_parsing.py::TestValidateRearrangeExpressions::test_unexpected_axes_lengths, test/functorch/test_parsing.py::TestValidateRearrangeExpressions::test_validate_axes_lengths_are_integers 2025-08-14T22:25:08.6212469Z 2025-08-14T22:25:12.7182549Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:25:12.7183890Z import pkg_resources 2025-08-14T22:25:13.1097787Z Running test_hop_infra 1/1 ... [2025-08-14 22:25:13.109234] 2025-08-14T22:25:13.1098346Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:25:13.1114124Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_hop_infra.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:25:13.109668] 2025-08-14T22:25:18.0845034Z 2025-08-14T22:25:18.0846147Z test_hop_infra 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_hop_infra_1.1_3c5afc8550f98874_.log 2025-08-14T22:25:18.0848730Z Running 3 items in this shard: test/test_hop_infra.py::TestHOPInfra::test_all_hops_are_imported, test/test_hop_infra.py::TestHOPInfra::test_all_hops_have_opinfo, test/test_hop_infra.py::TestHOPInfra::test_imports_from_all_work 2025-08-14T22:25:18.0850477Z 2025-08-14T22:25:22.1896043Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:25:22.1897483Z import pkg_resources 2025-08-14T22:25:22.5747494Z Running test_comparison_utils 1/1 ... [2025-08-14 22:25:22.574184] 2025-08-14T22:25:22.5748321Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:25:22.5750366Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_comparison_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:25:22.574594] 2025-08-14T22:25:27.1510368Z 2025-08-14T22:25:27.1511099Z test_comparison_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_comparison_utils_1.1_9043bfc9d5d44193_.log 2025-08-14T22:25:27.1513662Z Running 7 items in this shard: test/test_comparison_utils.py::TestComparisonUtils::test_all_equal_no_assert, test/test_comparison_utils.py::TestComparisonUtils::test_all_equal_no_assert_nones, test/test_comparison_utils.py::TestComparisonUtils::test_assert_device, test/test_comparison_utils.py::TestComparisonUtils::test_assert_dtype, test/test_comparison_utils.py::TestComparisonUtils::test_assert_layout, test/test_comparison_utils.py::TestComparisonUtils::test_assert_sizes, test/test_comparison_utils.py::TestComparisonUtils::test_assert_strides 2025-08-14T22:25:27.1516070Z 2025-08-14T22:25:31.2668045Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:25:31.2669387Z import pkg_resources 2025-08-14T22:25:31.6557162Z Running functorch/test_control_flow 1/1 ... [2025-08-14 22:25:31.655235] 2025-08-14T22:25:31.6557947Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:25:31.6560306Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_control_flow.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:25:31.655684] 2025-08-14T22:25:39.3361264Z 2025-08-14T22:25:39.3362620Z functorch/test_control_flow 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_control_flow_1.1_78a84830e07b09aa_.log 2025-08-14T22:25:39.4052277Z Running 1317 items in this shard: test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_complex, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_different_pytree_output, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_gpu, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_grad_through_cond, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_inner_fn, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_inner_tensor, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_mixed_require_grad, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_nested, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_pytree_input, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_pytree_not_all_inputs_used, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_same_pytree_output, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_simple, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_torch_nn_module, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_user_nn_module, test/functorch/test_control_flow.py::TestControlFlow::test_cond_autograd_zeros_unused_branch_complex_compile_fail_compile_mode_compile_dynamic_shape_scalar_False, test/functorch/test_control_flow.py::TestControlFlow::test_cond_gpu, test/functorch/test_control_flow.py::TestControlFlow::test_cond_in_forloop, test/functorch/test_control_flow.py::TestControlFlow::test_cond_no_trace, test/functorch/test_control_flow.py::TestControlFlow::test_map_autograd_higher_order, test/functorch/test_control_flow.py::TestControlFlow::test_map_autograd_nested_list, test/functorch/test_control_flow.py::TestControlFlow::test_map_autograd_no_grad_output, test/functorch/test_control_flow.py::TestControlFlow::test_map_autograd_simple, test/functorch/test_control_flow.py::TestControlFlow::test_map_autograd_simple_partial_grad, test/functorch/test_control_flow.py::TestControlFlow::test_map_dict_in_out, test/functorch/test_control_flow.py::TestControlFlow::test_map_gpu, test/functorch/test_control_flow.py::TestControlFlow::test_map_illegal_inputs, test/functorch/test_control_flow.py::TestControlFlow::test_map_illegal_outputs, test/functorch/test_control_flow.py::TestControlFlow::test_map_list_in_out, test/functorch/test_control_flow.py::TestControlFlow::test_scan_associative_scan, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_binary_operator_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_carry_carry_alias, test/functorch/test_control_flow.py::TestControlFlow::test_scan_carry_output_alias, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_compile_mode_eager_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_compile_mode_eager_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_compile_mode_none_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_compile_mode_none_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_additional_inputs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_additional_inputs_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_complex_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_complex_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_init_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_init_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_random_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_random_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_xs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_eager_partial_grad_xs_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_additional_inputs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_additional_inputs_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_complex_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_complex_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_init_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_init_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_random_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_random_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_xs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_False_compile_mode_none_partial_grad_xs_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_additional_inputs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_additional_inputs_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_complex_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_complex_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_init_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_init_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_random_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_random_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_xs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_eager_partial_grad_xs_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_additional_inputs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_additional_inputs_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_complex_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_complex_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_init_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_init_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_random_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_random_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_xs_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_RNN_partial_autograd_reverse_True_compile_mode_none_partial_grad_xs_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_carries_ys_same_grad_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_all_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_additional_inputs_partial_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_for_out_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_equal_grad_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_combine_fn_with_no_grad_init_carries_unequal_grad_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_closure_nested_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_cnt_reverse_False_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_cnt_reverse_False_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_cnt_reverse_True_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_cnt_reverse_True_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_compile_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_complex_pytree_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dim_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_eager_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_matmul_compile_mode_none_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_eager_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_False_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_False_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_False_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_False_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_True_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_True_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_True_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_downstream_scan_scan_dim_compile_mode_none_reverse_True_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cpu_complex64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cpu_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cpu_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cpu_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cpu_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cuda_complex64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cuda_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cuda_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cuda_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_eager_cuda_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cpu_complex64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cpu_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cpu_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cpu_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cpu_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cuda_complex64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cuda_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cuda_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cuda_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_False_compile_mode_none_cuda_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cpu_complex64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cpu_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cpu_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cpu_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cpu_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cuda_complex64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cuda_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cuda_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cuda_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_eager_cuda_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cpu_complex64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cpu_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cpu_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cpu_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cpu_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cuda_complex64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cuda_float16, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cuda_float32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cuda_int32, test/functorch/test_control_flow.py::TestControlFlow::test_scan_dtype_reverse_True_compile_mode_none_cuda_int64, test/functorch/test_control_flow.py::TestControlFlow::test_scan_float_output, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_non_tensor, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_pytree_complex_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_scanned_0, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_carry_shape, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_complex_reverse_False_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_complex_reverse_False_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_complex_reverse_True_cpu, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_complex_reverse_True_cuda, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_init_longer_carry, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_pytree_init_shorter_carry, test/functorch/test_control_flow.py::TestControlFlow::test_scan_init_wrong_shape, test/functorch/test_control_flow.py::TestControlFlow::test_scan_input_carry_alias, test/functorch/test_control_flow.py::TestControlFlow::test_scan_input_mutation, test/functorch/test_control_flow.py::TestControlFlow::test_scan_input_output_alias, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_non_pointwise_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_one_return, test/functorch/test_control_flow.py::TestControlFlow::test_scan_pytree_output, test/functorch/test_control_flow.py::TestControlFlow::test_scan_simple_graph, test/functorch/test_control_flow.py::TestControlFlow::test_scan_simple_graph_wrong_dtype, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_False_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_True_compile_mode_eager_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_True_compile_mode_eager_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_True_compile_mode_eager_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_True_compile_mode_eager_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_True_compile_mode_none_cpu_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_True_compile_mode_none_cpu_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_True_compile_mode_none_cuda_autograd_False, test/functorch/test_control_flow.py::TestControlFlow::test_scan_tuple_reverse_True_compile_mode_none_cuda_autograd_True, test/functorch/test_control_flow.py::TestControlFlow::test_scan_wrong_pytree, test/functorch/test_control_flow.py::TestControlFlow::test_scan_y_less_ndim_then_dim, test/functorch/test_control_flow.py::TestControlFlow::test_while_loop_gpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_binary_operator_compile_mode_none_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_combine_fn_wrong_meta_in_combine_fn, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_eager_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_eager_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_eager_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_eager_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_none_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_none_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_none_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_False_compile_mode_none_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_eager_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_eager_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_eager_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_eager_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_none_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_none_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_none_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_compile_reverse_True_compile_mode_none_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_complex_pytree_compile_mode_none_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_cond_in_combine_fn_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_different_input_size_wrong_dim, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_eager_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_eager_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_eager_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_eager_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_none_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_none_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_none_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_False_compile_mode_none_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_compile_dynamic_shape_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_eager_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_eager_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_eager_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_eager_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_none_combine_mode_generic_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_none_combine_mode_generic_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_none_combine_mode_pointwise_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_reverse_True_compile_mode_none_combine_mode_pointwise_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_dim_shape_failure, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_generic_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_matmul_combine_mode_pointwise_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_generic_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_combine_mode_pointwise_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_first_False_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_first_False_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_first_False_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_first_False_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_False_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_False_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_False_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_False_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_True_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_True_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_True_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_compile_reverse_first_True_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_False_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_False_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_False_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_False_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_True_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_True_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_True_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_eager_reverse_first_True_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_False_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_False_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_False_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_False_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_True_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_True_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_True_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_generic_compile_mode_none_reverse_first_True_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_False_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_False_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_False_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_False_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_dynamic_shape_reverse_first_True_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_False_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_False_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_False_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_False_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_True_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_True_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_True_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_compile_reverse_first_True_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_False_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_False_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_False_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_False_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_True_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_True_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_True_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_eager_reverse_first_True_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_False_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_False_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_False_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_False_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_True_same_direction_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_True_same_direction_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_True_same_direction_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_downstream_scan_scan_different_dim_combine_mode_pointwise_compile_mode_none_reverse_first_True_same_direction_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_none_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_none_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_none_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_none_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_none_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_none_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_none_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_expand_in_combine_fn_compile_mode_none_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_compile_mode_none_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_fct_generic_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_nested_compile_mode_none_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_False_cpu_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_False_cpu_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_False_cuda_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_False_cuda_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_True_cpu_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_True_cpu_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_True_cuda_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_dynamic_shape_reverse_True_cuda_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_False_cpu_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_False_cpu_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_False_cuda_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_False_cuda_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_True_cpu_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_True_cpu_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_True_cuda_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_compile_reverse_True_cuda_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_False_cpu_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_False_cpu_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_False_cuda_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_False_cuda_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_True_cpu_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_True_cpu_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_True_cuda_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_eager_reverse_True_cuda_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_False_cpu_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_False_cpu_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_False_cuda_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_False_cuda_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_True_cpu_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_True_cpu_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_True_cuda_combine_mode_generic, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_pytree_compile_mode_none_reverse_True_cuda_combine_mode_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_eager_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_eager_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_eager_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_eager_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_shape_check_compile_mode_none_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_freevars_simple_compile_mode_none_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_input_mutation, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_input_output_alias, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_dynamic_shape_loop_type_for_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_dynamic_shape_loop_type_for_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_dynamic_shape_loop_type_for_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_dynamic_shape_loop_type_for_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_loop_type_for_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_loop_type_for_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_loop_type_for_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_compile_loop_type_for_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_eager_loop_type_for_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_eager_loop_type_for_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_eager_loop_type_for_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_eager_loop_type_for_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_none_loop_type_for_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_none_loop_type_for_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_none_loop_type_for_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_compile_mode_none_loop_type_for_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_loop_in_combine_fn_failure, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_map_in_combine_fn, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_nested, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_contiguous_tensor_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_compile_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_compile_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_compile_dynamic_shape_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_compile_dynamic_shape_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_eager_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_eager_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_none_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_False_compile_mode_none_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_compile_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_compile_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_compile_dynamic_shape_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_compile_dynamic_shape_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_eager_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_eager_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_none_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_non_pointwise_generic_reverse_True_compile_mode_none_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_output_output_alias, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_pytree_output, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_sparse_tensor, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_compile_dynamic_shape_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_eager_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_generic_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_generic_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_generic_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_generic_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_pointwise_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_pointwise_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_pointwise_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_tuple_compile_mode_none_combine_mode_pointwise_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_dynamic_shape_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_compile_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_eager_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_eager_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_eager_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_eager_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_none_reverse_False_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_none_reverse_False_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_none_reverse_True_cpu, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_vmap_in_combine_fn_compile_mode_none_reverse_True_cuda, test/functorch/test_control_flow.py::AssociativeScanTests::test_associative_scan_wrong_pytree, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_accepts_torch_function_as_inputs, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_autograd_backward, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_eager_run_with_item, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_aot_func_check_functional, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_data_dependent_pred, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_input_aliasing_with_aot_func, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_input_mutation_on_false_branch, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_input_mutation_on_true_branch, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_nested_input_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_nested_input_mutation_with_aot_func, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_functionalized_output_alias_input, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_make_fx_preserve_stack_trace_for_nodes_in_subgraph, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_merge_graph_preserves_ph_meta, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_output_dynamic_False_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_output_dynamic_False_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_output_dynamic_True_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_output_dynamic_True_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_strided_output_dynamic_False_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_strided_output_dynamic_False_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_strided_output_dynamic_True_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_mismatched_branch_strided_output_dynamic_True_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_traced, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_traced_fake_tensor, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_traced_multi, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_traced_multi_fake_tensor, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_traced_other_inputs, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_traced_other_inputs_fake_tensor, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_with_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_nested_with_closure_graph_module, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_no_dynamo_cache_limit, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_retrace_functionalized, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_simple_with_linear_compile_check_graph, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_subgraph_same_shape_env_as_parent, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_symint_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_symint_operands_requires_grad_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_symint_operands_requires_grad_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_trace_set__and_mutate_input, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_trace_set__and_mutate_intermediate, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_traced_not_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_traced_not_nested_fake_tensor, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_function_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_function_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_function_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_function_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_function_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_function_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_module_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_boolTensor_innerFnType_object_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_function_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_function_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_function_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_function_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_function_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_function_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_function_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_function_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_module_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_bool_innerFnType_object_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_function_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_module_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_object_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_object_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_object_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_object_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_object_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_floatTensor_innerFnType_object_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_function_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_module_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_0_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_0_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_0_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_0_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_tracing_with_valid_inputs_predType_intTensor_innerFnType_object_nOperands_1_nClosure_1_nesting_2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_unbacked_symint_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_multiple_args_with_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_multiple_inputs, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_multiple_outputs_nClosure_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_multiple_outputs_nClosure_1, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_function_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_function_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_function_nOperands_2_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_function_nOperands_2_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_module_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_module_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_module_nOperands_2_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_module_nOperands_2_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_object_nOperands_1_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_object_nOperands_1_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_object_nOperands_2_nClosure_0_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_predType_boolTensor_innerFnType_object_nOperands_2_nClosure_1_nesting_0, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_vmap_single_input_with_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_consecutive_make_fx_symbolic, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_module_param_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_module_python_scalar_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_sym_pred, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_tensor_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_tensor_closure_graph_module, test/functorch/test_control_flow.py::TestControlFlowTraced::test_cond_with_unbacked_sym_pred, test/functorch/test_control_flow.py::TestControlFlowTraced::test_hop_raises_if_not_overriding_call, test/functorch/test_control_flow.py::TestControlFlowTraced::test_input_input_alias, test/functorch/test_control_flow.py::TestControlFlowTraced::test_input_mutation_inference_mode_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_input_mutation_inference_mode_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_input_output_alias, test/functorch/test_control_flow.py::TestControlFlowTraced::test_map_functionalized, test/functorch/test_control_flow.py::TestControlFlowTraced::test_map_functionalized_aot_func, test/functorch/test_control_flow.py::TestControlFlowTraced::test_map_functionalized_arg_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_map_functionalized_elem_alias, test/functorch/test_control_flow.py::TestControlFlowTraced::test_map_functionalized_elem_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_map_unfunc_boolean_tensor_for_nested_map_cond, test/functorch/test_control_flow.py::TestControlFlowTraced::test_merge_output, test/functorch/test_control_flow.py::TestControlFlowTraced::test_nested_cond_map_cond_symbolic, test/functorch/test_control_flow.py::TestControlFlowTraced::test_nested_map_cond_real, test/functorch/test_control_flow.py::TestControlFlowTraced::test_nested_map_cond_symbolic, test/functorch/test_control_flow.py::TestControlFlowTraced::test_raise_error_on_mismatch_tensor_size, test/functorch/test_control_flow.py::TestControlFlowTraced::test_raise_error_on_mismatch_tensor_size_fake_tensor, test/functorch/test_control_flow.py::TestControlFlowTraced::test_raise_error_on_mismatch_type_size, test/functorch/test_control_flow.py::TestControlFlowTraced::test_raise_error_on_mismatch_type_size_fake_tensor, test/functorch/test_control_flow.py::TestControlFlowTraced::test_scan_functionalized, test/functorch/test_control_flow.py::TestControlFlowTraced::test_scan_functionalized_elem_alias, test/functorch/test_control_flow.py::TestControlFlowTraced::test_scan_functionalized_elem_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_scan_pytree_closure, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_autograd_aot_functionalized, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_autograd_symbolic_dict, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_autograd_symbolic_list, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_autograd_symbolic_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_real, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_symbolic_dict, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_symbolic_list, test/functorch/test_control_flow.py::TestControlFlowTraced::test_tracing_map_symbolic_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_two_hops_not_sharing_code_obj, test/functorch/test_control_flow.py::TestControlFlowTraced::test_vmap_vmap_boolcond_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_vmap_vmap_boolcond_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_const_and_symint_output, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_int_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_nested2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_nested_with_linear, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_pytree_int_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_simple_with_linear, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_aot_eager_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_const_and_symint_output, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_int_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_nested2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_nested_with_linear, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_pytree_int_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_simple_with_linear, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_compile_backend_eager_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_cpp_while_loop_test_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_cpp_while_loop_test_nested2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_cpp_while_loop_test_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_cpp_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_cpp_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_functorch_while_loop_test_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_functorch_while_loop_test_nested2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_functorch_while_loop_test_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_functorch_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_functorch_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_no_while_loop_test_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_no_while_loop_test_nested2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_no_while_loop_test_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_no_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_no_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_python_while_loop_test_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_python_while_loop_test_nested2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_python_while_loop_test_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_python_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_functionalize_func_type_python_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_nested2_traced, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_nested_traced, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_compile_dynamic_False_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_compile_dynamic_False_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_compile_dynamic_True_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_compile_dynamic_True_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_export_strict_False_dynamic_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_export_strict_False_dynamic_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_export_strict_True_dynamic_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_constant_and_symint_output_export_strict_True_dynamic_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_compile_dynamic_False_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_compile_dynamic_False_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_compile_dynamic_True_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_compile_dynamic_True_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_export_strict_False_dynamic_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_export_strict_False_dynamic_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_export_strict_True_dynamic_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_int_carry_export_strict_True_dynamic_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_mismatch_in_meta, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_compile_dynamic_False_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_compile_dynamic_False_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_compile_dynamic_True_backend_aot_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_compile_dynamic_True_backend_eager, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_export_strict_False_dynamic_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_export_strict_False_dynamic_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_export_strict_True_dynamic_False, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_op_pytree_int_carry_export_strict_True_dynamic_True, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_simple_functionalize_check_graph_func_type_cpp, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_simple_functionalize_check_graph_func_type_functorch, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_simple_functionalize_check_graph_func_type_no, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_simple_functionalize_check_graph_func_type_python, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_simple_with_linear_compile_check_graph, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_nested, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_nested2, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_nested_with_linear, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_simple, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_simple_with_linear, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_simple_with_mutation, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_tracing_while_loop_test_simple_with_pytree_carry, test/functorch/test_control_flow.py::TestControlFlowTraced::test_while_loop_unbacked_bindings, test/functorch/test_control_flow.py::TestHopSchema::test_function_schema_gen, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_GraphModule, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_ScriptObj, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_SymBool, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_SymInt, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_Tensor, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_bool, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_float, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_int, test/functorch/test_control_flow.py::TestHopSchema::test_list_gen_schema_type_str, test/functorch/test_control_flow.py::TestHopSchema::test_schema_tree_spec, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_GraphModule, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_ScriptObj, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_SymBool, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_SymInt, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_Tensor, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_bool, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_float, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_int, test/functorch/test_control_flow.py::TestHopSchema::test_type_gen_schema_type_str, test/functorch/test_control_flow.py::TestHopSchema::test_while_loop_schema_gen 2025-08-14T22:25:39.4710468Z 2025-08-14T22:25:43.5174167Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:25:43.5175507Z import pkg_resources 2025-08-14T22:25:43.9017688Z Running functorch/test_ac_logging 1/1 ... [2025-08-14 22:25:43.901232] 2025-08-14T22:25:43.9018456Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:25:43.9020172Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ac_logging.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:25:43.901647] 2025-08-14T22:25:48.2750597Z 2025-08-14T22:25:48.2751757Z functorch/test_ac_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ac_logging_1.1_b90373128048c882_.log 2025-08-14T22:25:48.2753882Z Running 4 items in this shard: test/functorch/test_ac_logging.py::TestAcLogging::test_create_activation_checkpointing_logging_structure_payload, test/functorch/test_ac_logging.py::TestAcLogging::test_create_joint_graph_edges, test/functorch/test_ac_logging.py::TestAcLogging::test_create_joint_graph_node_information, test/functorch/test_ac_logging.py::TestAcLogging::test_create_structured_trace_for_min_cut_info 2025-08-14T22:25:48.2755409Z 2025-08-14T22:25:52.4214325Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:25:52.4215990Z import pkg_resources 2025-08-14T22:25:52.8083753Z Running test_mkl_verbose 1/1 ... [2025-08-14 22:25:52.808030] 2025-08-14T22:25:52.8084147Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:25:52.8088311Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkl_verbose.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:25:52.808430] 2025-08-14T22:25:57.3827241Z 2025-08-14T22:25:57.3828332Z test_mkl_verbose 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkl_verbose_1.1_24b447eeeaf7a91f_.log 2025-08-14T22:25:57.3829389Z Running 2 items in this shard: test/test_mkl_verbose.py::TestMKLVerbose::test_verbose_off, test/test_mkl_verbose.py::TestMKLVerbose::test_verbose_on 2025-08-14T22:25:57.3829947Z 2025-08-14T22:26:01.5369155Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:01.5370777Z import pkg_resources 2025-08-14T22:26:01.9216012Z Running test_appending_byte_serializer 1/1 ... [2025-08-14 22:26:01.921064] 2025-08-14T22:26:01.9216641Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:01.9219453Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_appending_byte_serializer.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:01.921545] 2025-08-14T22:26:06.4449597Z 2025-08-14T22:26:06.4450781Z test_appending_byte_serializer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_appending_byte_serializer_1.1_43bc8f05319ad8e6_.log 2025-08-14T22:26:06.4452802Z Running 3 items in this shard: test/test_appending_byte_serializer.py::TestAppendingByteSerializer::test_checksum, test/test_appending_byte_serializer.py::TestAppendingByteSerializer::test_write_and_read_class, test/test_appending_byte_serializer.py::TestAppendingByteSerializer::test_write_and_read_int 2025-08-14T22:26:06.4453989Z 2025-08-14T22:26:10.6011988Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:10.6013312Z import pkg_resources 2025-08-14T22:26:10.9859529Z Running test_autoload 1/1 ... [2025-08-14 22:26:10.985484] 2025-08-14T22:26:10.9859932Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:10.9862742Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_autoload.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:10.985901] 2025-08-14T22:26:15.5594658Z 2025-08-14T22:26:15.5595554Z test_autoload 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_autoload_1.1_d6478d761f330ede_.log 2025-08-14T22:26:15.5596385Z Running 1 items in this shard: test/test_autoload.py::TestDeviceBackendAutoload::test_autoload 2025-08-14T22:26:15.5596773Z 2025-08-14T22:26:19.7141011Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:19.7142700Z import pkg_resources 2025-08-14T22:26:20.1002134Z Running test_proxy_tensor 1/1 ... [2025-08-14 22:26:20.099677] 2025-08-14T22:26:20.1002548Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:20.1005101Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_proxy_tensor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:20.100123] 2025-08-14T22:26:26.5770841Z 2025-08-14T22:26:26.5771764Z test_proxy_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_proxy_tensor_1.1_e9eff395dc8ac999_.log 2025-08-14T22:26:26.5823662Z Running 173 items in this shard: test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_allclose, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_amp_cache, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_constant_blowup, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_constant_proxy_tensor_mut, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_constant_random, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_constant_unbind, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_decomp_of_capture, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_decomposition_interpreter, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_empty_like_doesnt_burn_in_defaults, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_inplace_metadata, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_isolated_graphmodule, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_make_fx_model_double_param, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_make_fx_model_fwd_bwd, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_make_fx_model_fwd_bwd_wgtupdate, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_make_fx_overloads, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_make_fx_reentrant_dispatch, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_make_fx_simple, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_mode_tracing_factory_function, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_partial_decomp, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_pickle_issue89626, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_pr_86917, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_pre_dispatch_functionalization, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_pre_dispatch_functionalization_view_op, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_pre_dispatch_linear, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_pre_dispatch_mode_stack, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_pre_dispatch_no_grad, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_proxy_tensor, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_proxy_tensor_mode_with_decomp_table_preserves_proxy, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_resnet18_backward_trace, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_scalar_device, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_strides, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_tensor_constants, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_trace_subclasses, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_val_metadata_mutation, test/test_proxy_tensor.py::TestGenericProxyTensorReal::test_varargs, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_allclose, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_amp_cache, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_constant_blowup, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_constant_proxy_tensor_mut, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_constant_random, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_constant_unbind, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_decomp_of_capture, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_decomposition_interpreter, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_empty_like_doesnt_burn_in_defaults, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_inplace_metadata, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_isolated_graphmodule, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_make_fx_model_double_param, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_make_fx_model_fwd_bwd, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_make_fx_model_fwd_bwd_wgtupdate, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_make_fx_overloads, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_make_fx_reentrant_dispatch, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_make_fx_simple, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_mode_tracing_factory_function, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_partial_decomp, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_pickle_issue89626, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_pr_86917, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_pre_dispatch_functionalization, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_pre_dispatch_functionalization_view_op, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_pre_dispatch_linear, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_pre_dispatch_mode_stack, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_pre_dispatch_no_grad, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_proxy_tensor, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_proxy_tensor_mode_with_decomp_table_preserves_proxy, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_resnet18_backward_trace, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_scalar_device, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_strides, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_tensor_constants, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_trace_subclasses, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_val_metadata_mutation, test/test_proxy_tensor.py::TestGenericProxyTensorFake::test_varargs, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_allclose, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_amp_cache, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_constant_blowup, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_constant_proxy_tensor_mut, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_constant_random, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_constant_unbind, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_decomp_of_capture, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_decomposition_interpreter, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_empty_like_doesnt_burn_in_defaults, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_inplace_metadata, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_isolated_graphmodule, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_make_fx_model_double_param, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_make_fx_model_fwd_bwd, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_make_fx_model_fwd_bwd_wgtupdate, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_make_fx_overloads, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_make_fx_reentrant_dispatch, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_make_fx_simple, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_mode_tracing_factory_function, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_partial_decomp, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_pickle_issue89626, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_pr_86917, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_pre_dispatch_functionalization, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_pre_dispatch_functionalization_view_op, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_pre_dispatch_linear, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_pre_dispatch_mode_stack, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_pre_dispatch_no_grad, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_proxy_tensor, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_proxy_tensor_mode_with_decomp_table_preserves_proxy, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_resnet18_backward_trace, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_scalar_device, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_strides, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_tensor_constants, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_trace_subclasses, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_val_metadata_mutation, test/test_proxy_tensor.py::TestGenericProxyTensorSymbolic::test_varargs, test/test_proxy_tensor.py::TestRealProxyTensor::test_error_on_data_dependent_ops, test/test_proxy_tensor.py::TestFakeProxyTensor::test_alias, test/test_proxy_tensor.py::TestFakeProxyTensor::test_fake_tensor_mode, test/test_proxy_tensor.py::TestFakeProxyTensor::test_free_fake, test/test_proxy_tensor.py::TestFakeProxyTensor::test_fused_adam, test/test_proxy_tensor.py::TestFakeProxyTensor::test_issue82547, test/test_proxy_tensor.py::TestFakeProxyTensor::test_meta, test/test_proxy_tensor.py::TestFakeProxyTensor::test_use_fake_and_tensor, test/test_proxy_tensor.py::TestSymbolicTracing::test_adv_index_batch, test/test_proxy_tensor.py::TestSymbolicTracing::test_arange_unbacked_output_size, test/test_proxy_tensor.py::TestSymbolicTracing::test_binary_broadcast, test/test_proxy_tensor.py::TestSymbolicTracing::test_boolean_index, test/test_proxy_tensor.py::TestSymbolicTracing::test_broadcast_shapes, test/test_proxy_tensor.py::TestSymbolicTracing::test_cat, test/test_proxy_tensor.py::TestSymbolicTracing::test_constant_specialization, test/test_proxy_tensor.py::TestSymbolicTracing::test_cpu_scalar_cuda, test/test_proxy_tensor.py::TestSymbolicTracing::test_cumsum_unbacked, test/test_proxy_tensor.py::TestSymbolicTracing::test_debug_interpreter, test/test_proxy_tensor.py::TestSymbolicTracing::test_deduped_shape, test/test_proxy_tensor.py::TestSymbolicTracing::test_dynamic_pointwise_scalar, test/test_proxy_tensor.py::TestSymbolicTracing::test_elementwise_meta_with_sym_numbers, test/test_proxy_tensor.py::TestSymbolicTracing::test_expand, test/test_proxy_tensor.py::TestSymbolicTracing::test_fake_tensor_as_size, test/test_proxy_tensor.py::TestSymbolicTracing::test_guard_lowerbound_range_refinement, test/test_proxy_tensor.py::TestSymbolicTracing::test_guard_lowerbound_range_refinement_multivariate, test/test_proxy_tensor.py::TestSymbolicTracing::test_guard_upperbound_range_refinement, test/test_proxy_tensor.py::TestSymbolicTracing::test_guard_upperbound_range_refinement_multivariate, test/test_proxy_tensor.py::TestSymbolicTracing::test_guards_equal, test/test_proxy_tensor.py::TestSymbolicTracing::test_int_input, test/test_proxy_tensor.py::TestSymbolicTracing::test_invalidate_nonzero, test/test_proxy_tensor.py::TestSymbolicTracing::test_invalidate_nonzero_propagate_real_tensors, test/test_proxy_tensor.py::TestSymbolicTracing::test_item, test/test_proxy_tensor.py::TestSymbolicTracing::test_item_to_constructor, test/test_proxy_tensor.py::TestSymbolicTracing::test_make_fx_with_custom_tracer_preserving_nn_module_stack, test/test_proxy_tensor.py::TestSymbolicTracing::test_mega_guard, test/test_proxy_tensor.py::TestSymbolicTracing::test_metadata, test/test_proxy_tensor.py::TestSymbolicTracing::test_metadata_fresh, test/test_proxy_tensor.py::TestSymbolicTracing::test_mod_gcd_unbacked, test/test_proxy_tensor.py::TestSymbolicTracing::test_multiply_shape, test/test_proxy_tensor.py::TestSymbolicTracing::test_neg_shape, test/test_proxy_tensor.py::TestSymbolicTracing::test_new_empty, test/test_proxy_tensor.py::TestSymbolicTracing::test_non_deduped_shape, test/test_proxy_tensor.py::TestSymbolicTracing::test_non_symint_size_spec, test/test_proxy_tensor.py::TestSymbolicTracing::test_nonidentity_transitive_guards, test/test_proxy_tensor.py::TestSymbolicTracing::test_reflect_r_over_x, test/test_proxy_tensor.py::TestSymbolicTracing::test_repeat_interleave, test/test_proxy_tensor.py::TestSymbolicTracing::test_repeat_interleave_unbacked_output_size, test/test_proxy_tensor.py::TestSymbolicTracing::test_reshape_divisibility_unbacked, test/test_proxy_tensor.py::TestSymbolicTracing::test_resize_from_zero, test/test_proxy_tensor.py::TestSymbolicTracing::test_return_symint, test/test_proxy_tensor.py::TestSymbolicTracing::test_rmethod, test/test_proxy_tensor.py::TestSymbolicTracing::test_setitem_symint, test/test_proxy_tensor.py::TestSymbolicTracing::test_size_with_tensor, test/test_proxy_tensor.py::TestSymbolicTracing::test_split_unbacked_sizes, test/test_proxy_tensor.py::TestSymbolicTracing::test_sqrt_size, test/test_proxy_tensor.py::TestSymbolicTracing::test_sym_storage_offset, test/test_proxy_tensor.py::TestSymbolicTracing::test_symbolic_repeat_interleave, test/test_proxy_tensor.py::TestSymbolicTracing::test_symint_to_tensor, test/test_proxy_tensor.py::TestSymbolicTracing::test_tensor_symfloat, test/test_proxy_tensor.py::TestSymbolicTracing::test_unary, test/test_proxy_tensor.py::TestSymbolicTracing::test_unbacked_batch_resnet, test/test_proxy_tensor.py::TestSymbolicTracing::test_unbacked_slice, test/test_proxy_tensor.py::TestSymbolicTracing::test_unbacked_unification, test/test_proxy_tensor.py::TestSymbolicTracing::test_unbacked_unify_dependency_violation, test/test_proxy_tensor.py::TestSymbolicTracing::test_unbacked_unify_guard, test/test_proxy_tensor.py::TestSymbolicTracing::test_unbacked_unify_guard_transitivity, test/test_proxy_tensor.py::TestSymbolicTracing::test_view_divisibility_unbacked, test/test_proxy_tensor.py::TestSymbolicTracing::test_view_divisibility_unbacked_relatively_prime 2025-08-14T22:26:26.5873111Z 2025-08-14T22:26:30.7557688Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:30.7559021Z import pkg_resources 2025-08-14T22:26:31.1404477Z Running test_mkldnn_verbose 1/1 ... [2025-08-14 22:26:31.139919] 2025-08-14T22:26:31.1404900Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:31.1408775Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkldnn_verbose.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:31.140318] 2025-08-14T22:26:35.6642769Z 2025-08-14T22:26:35.6644186Z test_mkldnn_verbose 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkldnn_verbose_1.1_5f2ed14f9a9678a4_.log 2025-08-14T22:26:35.6645510Z Running 2 items in this shard: test/test_mkldnn_verbose.py::TestMKLDNNVerbose::test_verbose_off, test/test_mkldnn_verbose.py::TestMKLDNNVerbose::test_verbose_on 2025-08-14T22:26:35.6646213Z 2025-08-14T22:26:39.8696432Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:39.8697757Z import pkg_resources 2025-08-14T22:26:40.2636762Z Running test_utils_config_module 1/1 ... [2025-08-14 22:26:40.263093] 2025-08-14T22:26:40.2637871Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:40.2639315Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_utils_config_module.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:40.263531] 2025-08-14T22:26:40.8770724Z 2025-08-14T22:26:40.8771690Z test_quantization 1/4 was successful, full logs can be found in artifacts with path test/test-reports/test_quantization_1.4_e7c77d5043e3495a_.log 2025-08-14T22:26:40.8903433Z Running 324 items in this shard: test/test_quantization.py::TestQuantizedOps::test_group_norm, test/test_quantization.py::TestQuantizedOps::test_int8_batch_norm_onednn, test/test_quantization.py::TestQuantizedOps::test_int8_mul_onednn, test/test_quantization.py::TestQuantizedOps::test_leaky_relu_observed_output, test/test_quantization.py::TestQuantizedOps::test_max_pool2d_pt2e, test/test_quantization.py::TestQuantizedOps::test_max_pool3d_nhwc, test/test_quantization.py::TestQuantizedOps::test_mul_scalar_relu, test/test_quantization.py::TestQuantizedOps::test_qadd_relu_same_qparams, test/test_quantization.py::TestQuantizedOps::test_qlayer_norm, test/test_quantization.py::TestQuantizedOps::test_qmul_relu_same_qparams, test/test_quantization.py::TestQuantizedOps::test_quantized_equal, test/test_quantization.py::TestQuantizedOps::test_quantized_mean_qnnpack, test/test_quantization.py::TestQuantizedOps::test_sigmoid, test/test_quantization.py::TestQNNPackOps::test_adaptive_avg_pool2d, test/test_quantization.py::TestQNNPackOps::test_qnnpack_add_broadcast, test/test_quantization.py::TestQNNPackOps::test_qnnpack_sigmoid, test/test_quantization.py::TestQuantizedLinear::test_qlinear_add_relu_pt2e, test/test_quantization.py::TestQuantizedLinear::test_qlinear_leaky_relu, test/test_quantization.py::TestQuantizedLinear::test_qlinear_pt2e, test/test_quantization.py::TestQuantizedLinear::test_qlinear_relu, test/test_quantization.py::TestQuantizedLinear::test_qlinear_sum_fp8, test/test_quantization.py::TestQuantizedLinear::test_qlinear_with_input_q_dq_qweight_dq_output_fp32, test/test_quantization.py::TestQuantizedLinear::test_wrapped_quantized_linear, test/test_quantization.py::TestQuantizedConv::test_conv_reorder_issue_onednn, test/test_quantization.py::TestQuantizedConv::test_qconv2d, test/test_quantization.py::TestQuantizedConv::test_qconv2d_cudnn, test/test_quantization.py::TestQuantizedConv::test_qconv2d_hardswish_pt2e, test/test_quantization.py::TestQuantizedConv::test_qconv2d_hardtanh_fp8, test/test_quantization.py::TestQuantizedConv::test_qconv2d_pt2e, test/test_quantization.py::TestQuantizedConv::test_qconv2d_sum_fp8, test/test_quantization.py::TestQuantizedConv::test_qconv2d_sum_relu_fp8, test/test_quantization.py::TestQuantizedConv::test_qconv2d_swish_pt2e, test/test_quantization.py::TestQuantizedConv::test_qconv3d_unpack, test/test_quantization.py::TestQuantizedConv::test_qconv_transpose2d, test/test_quantization.py::TestDynamicQuantizedOps::test_dynamic_conv1d, test/test_quantization.py::TestDynamicQuantizedOps::test_dynamic_conv2d, test/test_quantization.py::TestDynamicQuantizedOps::test_linear_dynamic_fp16_onednn, test/test_quantization.py::TestDynamicQuantizedOps::test_linear_prepack_fp16_numerics, test/test_quantization.py::TestDynamicQuantizedOps::test_qlinear_legacy, test/test_quantization.py::TestComparatorOps::test_compare_tensor_tensor, test/test_quantization.py::TestPadding::test_constant_padNd, test/test_quantization.py::TestQuantizedEmbeddingOps::test_embedding, test/test_quantization.py::TestQuantizedEmbeddingOps::test_embedding_bag_2d_indices, test/test_quantization.py::TestQuantizedEmbeddingOps::test_embedding_bag_4bit, test/test_quantization.py::TestFakeQuantizeOps::test_backward_per_channel, test/test_quantization.py::TestFakeQuantizeOps::test_backward_per_channel_cachemask_cpu, test/test_quantization.py::TestFakeQuantizeOps::test_backward_per_channel_cachemask_cuda, test/test_quantization.py::TestFakeQuantizeOps::test_fake_quant_per_channel_qparam_range, test/test_quantization.py::TestFakeQuantizeOps::test_forward_per_channel_half_precision_numerics, test/test_quantization.py::TestFakeQuantizeOps::test_fq_module_per_tensor, test/test_quantization.py::TestFakeQuantizeOps::test_learnable_backward_per_tensor_cpu, test/test_quantization.py::TestFakeQuantizeOps::test_learnable_forward_per_tensor_cuda, test/test_quantization.py::TestFusedObsFakeQuant::test_fused_obs_fake_quant_moving_avg_per_channel, test/test_quantization.py::TestQuantizedTensor::test_decomposed_quantize_per_channel, test/test_quantization.py::TestQuantizedTensor::test_dequantize_fp16_cuda, test/test_quantization.py::TestQuantizedTensor::test_per_channel_qtensor_creation_cuda, test/test_quantization.py::TestQuantizedTensor::test_per_tensor_to_device, test/test_quantization.py::TestQuantizedTensor::test_qtensor_channel_float_assignment, test/test_quantization.py::TestQuantizedTensor::test_qtensor_copy, test/test_quantization.py::TestQuantizedTensor::test_qtensor_fill_per_channel, test/test_quantization.py::TestQuantizedTensor::test_qtensor_fill_per_tensor, test/test_quantization.py::TestQuantizedTensor::test_qtensor_float_assignment, test/test_quantization.py::TestQuantizedTensor::test_qtensor_index_select_cpu, test/test_quantization.py::TestQuantizedTensor::test_qtensor_index_select_cuda, test/test_quantization.py::TestQuantizedTensor::test_qtensor_load_save, test/test_quantization.py::TestQuantizedTensor::test_qtensor_masked_fill_cuda, test/test_quantization.py::TestQuantizedTensor::test_qtensor_quantize_per_channel, test/test_quantization.py::TestQuantizedTensor::test_qtensor_reshape, test/test_quantization.py::TestQuantizedTensor::test_qtensor_sub_byte_not_aligned_cols, test/test_quantization.py::TestQuantizedTensor::test_qtensor_view, test/test_quantization.py::TestQuantizedTensor::test_quant_pin_memory, test/test_quantization.py::TestQuantizedTensor::test_repeat, test/test_quantization.py::TestObserver::test_histogram_observer_consistent_buffer_shape, test/test_quantization.py::TestObserver::test_zero_numel, test/test_quantization.py::TestStaticQuantizedModule::test_channel_shuffle, test/test_quantization.py::TestStaticQuantizedModule::test_conv1d_api, test/test_quantization.py::TestStaticQuantizedModule::test_conv1d_relu_api, test/test_quantization.py::TestStaticQuantizedModule::test_conv2d_add, test/test_quantization.py::TestStaticQuantizedModule::test_instance_norm, test/test_quantization.py::TestStaticQuantizedModule::test_layer_norm, test/test_quantization.py::TestStaticQuantizedModule::test_leaky_relu, test/test_quantization.py::TestStaticQuantizedModule::test_pool_api, test/test_quantization.py::TestStaticQuantizedModule::test_prelu, test/test_quantization.py::TestDynamicQuantizedModule::test_dynamic_conv2d, test/test_quantization.py::TestReferenceQuantizedModule::test_rnn, test/test_quantization.py::TestHistogramObserver::test_histogram_observer_one_sided, test/test_quantization.py::TestHistogramObserver::test_histogram_observer_update_within_range_succeeds, test/test_quantization.py::TestDistributed::test_device_affinity, test/test_quantization.py::TestDistributed::test_fake_quant_preserves_buffers, test/test_quantization.py::TestFusedObsFakeQuantModule::test_embedding_qat_config, test/test_quantization.py::TestFusedObsFakeQuantModule::test_fused_mod_reduce_range, test/test_quantization.py::TestFusedObsFakeQuantModule::test_fused_obs_fq_moving_avg_module, test/test_quantization.py::TestBackendConfig::test_backend_op_config_set_fuser_method, test/test_quantization.py::TestBackendConfig::test_backend_op_config_set_input_type_to_index, test/test_quantization.py::TestBackendConfig::test_backend_op_config_set_observation_type, test/test_quantization.py::TestBackendConfig::test_backend_op_config_set_qat_module, test/test_quantization.py::TestBackendConfig::test_dtype_config_to_dict, test/test_quantization.py::TestUtils::test_get_fqn_to_example_inputs_simple, test/test_quantization.py::TestUtils::test_quantize_weight_clamping_per_tensor, test/test_quantization.py::TestUtils::test_uint4_int4_dtype, test/test_quantization.py::TestQuantizeEagerPTQStatic::test_convtranspose_per_channel_qconfig_none, test/test_quantization.py::TestQuantizeEagerPTQStatic::test_custom_module_class, test/test_quantization.py::TestQuantizeEagerPTQStatic::test_quantized_embedding_bag, test/test_quantization.py::TestQuantizeEagerPTQStatic::test_quantwrapper_attaches_qconfig_to_dequant, test/test_quantization.py::TestQuantizeEagerPTQStatic::test_single_layer, test/test_quantization.py::TestQuantizeEagerPTQStatic::test_skip_quant, test/test_quantization.py::TestQuantizeEagerPTQStatic::test_two_layers, test/test_quantization.py::TestQuantizeEagerPTQDynamic::test_forward_hooks_preserved, test/test_quantization.py::TestQuantizeEagerPTQDynamic::test_per_channel_linear_quantize, test/test_quantization.py::TestQuantizeEagerOps::test_conv_transpose_2d, test/test_quantization.py::TestQuantizeEagerQAT::test_defused_embedding_bag_linear, test/test_quantization.py::TestQuantizeEagerQAT::test_embedding_bag_linear, test/test_quantization.py::TestQuantizeEagerQATNumerics::test_linear_bn_numerics, test/test_quantization.py::TestQuantizeEagerQATNumerics::test_linear_bn_symm_numerics, test/test_quantization.py::TestFuseEager::test_fuse_module_eval, test/test_quantization.py::TestFuseEager::test_fuse_modules_with_nested_hooks, test/test_quantization.py::TestFuseEager::test_fusion_conv_with_bias, test/test_quantization.py::TestNumericSuiteEager::test_compare_model_outputs_linear_dynamic, test/test_quantization.py::TestNumericSuiteEager::test_compare_model_outputs_lstm_dynamic, test/test_quantization.py::TestNumericSuiteEager::test_compare_model_stub_linear_dynamic, test/test_quantization.py::TestNumericSuiteEager::test_mobilenet_v3, test/test_quantization.py::TestEqualizeEager::test_cross_layer_equalization, test/test_quantization.py::TestEqualizeEager::test_equalize, test/test_quantization.py::TestEqualizeEager::test_equalize_fused_convrelu, test/test_quantization.py::TestBiasCorrectionEager::test_conv_chain, test/test_quantization.py::TestFuseFx::test_fuse_conv_bn_relu, test/test_quantization.py::TestQuantizeFx::test_conv_linear_reference, test/test_quantization.py::TestQuantizeFx::test_conv_transpose_not_reference, test/test_quantization.py::TestQuantizeFx::test_conv_transpose_reference, test/test_quantization.py::TestQuantizeFx::test_convert_custom_config_set_observed_to_quantized_mapping, test/test_quantization.py::TestQuantizeFx::test_deepcopy_preserve_attributes, test/test_quantization.py::TestQuantizeFx::test_default_qconfig_mapping_override_global, test/test_quantization.py::TestQuantizeFx::test_dynamic_quant_fp16, test/test_quantization.py::TestQuantizeFx::test_fuse_custom_config_set_preserved_attributes, test/test_quantization.py::TestQuantizeFx::test_get_executorch_backend_config, test/test_quantization.py::TestQuantizeFx::test_linear_tanh_lowering, test/test_quantization.py::TestQuantizeFx::test_mul_add_fp16_config, test/test_quantization.py::TestQuantizeFx::test_no_obs_between_unmatched_node_and_copy_node, test/test_quantization.py::TestQuantizeFx::test_non_traceable_module, test/test_quantization.py::TestQuantizeFx::test_prepare_custom_config_set_output_quantized_indexes, test/test_quantization.py::TestQuantizeFx::test_prepare_custom_config_set_preserved_attributes, test/test_quantization.py::TestQuantizeFx::test_prepared_model_deepcopy, test/test_quantization.py::TestQuantizeFx::test_preserve_attributes, test/test_quantization.py::TestQuantizeFx::test_propagate_dtypes_for_known_nodes_dict_split_tuple_args, test/test_quantization.py::TestQuantizeFx::test_qat_and_script, test/test_quantization.py::TestQuantizeFx::test_qat_skip_untraced, test/test_quantization.py::TestQuantizeFx::test_qconfig_dict_with_fused_modules, test/test_quantization.py::TestQuantizeFx::test_qconfig_for_call_func, test/test_quantization.py::TestQuantizeFx::test_qconfig_mapping_repr, test/test_quantization.py::TestQuantizeFx::test_qconfig_mapping_to_dict, test/test_quantization.py::TestQuantizeFx::test_qparams_fqn, test/test_quantization.py::TestQuantizeFx::test_quantized_input_fp32_output, test/test_quantization.py::TestQuantizeFx::test_relu_lowering, test/test_quantization.py::TestQuantizeFx::test_reshape_nontensor_args_not_observed, test/test_quantization.py::TestQuantizeFx::test_sequential, test/test_quantization.py::TestQuantizeFx::test_static_lstm, test/test_quantization.py::TestQuantizeFx::test_static_lstm_with_custom_fixed_qparams, test/test_quantization.py::TestQuantizeFxOps::test_add, test/test_quantization.py::TestQuantizeFxOps::test_fixed_qparams_ops_wrong_qconfig, test/test_quantization.py::TestQuantizeFxOps::test_gelu_reference, test/test_quantization.py::TestQuantizeFxOps::test_general_shape_ops, test/test_quantization.py::TestQuantizeFxOps::test_general_value_ops, test/test_quantization.py::TestQuantizeFxOps::test_linear_module, test/test_quantization.py::TestQuantizeFxOps::test_linear_static_fp16, test/test_quantization.py::TestQuantizeFxOps::test_mul, test/test_quantization.py::TestQuantizeFxOps::test_mul_relu, test/test_quantization.py::TestQuantizeFxOps::test_narrow, test/test_quantization.py::TestQuantizeFxOps::test_pixel_unshuffle, test/test_quantization.py::TestQuantizeFxOps::test_prelu, test/test_quantization.py::TestQuantizeFxOps::test_ref_pattern_multi_use, test/test_quantization.py::TestQuantizeFxOps::test_reshape_fp16, test/test_quantization.py::TestQuantizeFxOps::test_softmax_normal, test/test_quantization.py::TestQuantizeFxModels::test_prepare_serialize_switch_device_convert, test/test_quantization.py::TestQuantizeFxModels::test_qat_embeddingbag_linear, test/test_quantization.py::TestQuantizeFxModels::test_static_gpu_convert_basic, test/test_quantization.py::TestQuantizeFxModels::test_torchvision, test/test_quantization.py::TestSubgraphRewriter::test_subgraph_rewriter_annotations_int, test/test_quantization.py::TestSubgraphRewriter::test_subgraph_rewriter_replaces_referenced_submodules, test/test_quantization.py::TestGraphUtils::test_customized_equivalet_types_dict, test/test_quantization.py::TestDuplicateDQPass::test_no_need_for_duplicate_dq, test/test_quantization.py::TestMetaDataPorting::test_metadata_porting_for_dq, test/test_quantization.py::TestMetaDataPorting::test_metadata_porting_for_two_dq, test/test_quantization.py::TestNumericDebugger::test_added_node_gets_unique_id, test/test_quantization.py::TestNumericDebugger::test_copy_preserve_handle, test/test_quantization.py::TestNumericDebugger::test_deepcopy_preserve_handle, test/test_quantization.py::TestNumericDebugger::test_extract_results_from_loggers, test/test_quantization.py::TestNumericDebugger::test_quantize_pt2e_preserve_handle, test/test_quantization.py::TestNumericDebugger::test_re_export_preserve_handle, test/test_quantization.py::TestQuantizePT2E::test_allow_implicit_sharing, test/test_quantization.py::TestQuantizePT2E::test_composable_quantizer_linear_conv, test/test_quantization.py::TestQuantizePT2E::test_composable_quantizer_transform_for_annotation, test/test_quantization.py::TestQuantizePT2E::test_fold_all_ops_before_quantize, test/test_quantization.py::TestQuantizePT2E::test_move_exported_model_dropout_inplace, test/test_quantization.py::TestQuantizePT2E::test_quantization_dtype_float32_int16, test/test_quantization.py::TestQuantizePT2E::test_save_load, test/test_quantization.py::TestQuantizePT2E::test_shared_qspec_transitivity_case_2, test/test_quantization.py::TestQuantizePT2E::test_speed, test/test_quantization.py::TestQuantizePT2EAffineQuantization::test_dynamic_per_tok_act_per_group_weights, test/test_quantization.py::TestPT2ERepresentation::test_qdq, test/test_quantization.py::TestXNNPACKQuantizer::test_add_and_inplace_add, test/test_quantization.py::TestXNNPACKQuantizer::test_add_mul_long, test/test_quantization.py::TestXNNPACKQuantizer::test_add_mul_scalar, test/test_quantization.py::TestXNNPACKQuantizer::test_conv2d, test/test_quantization.py::TestXNNPACKQuantizer::test_conv_linear, test/test_quantization.py::TestXNNPACKQuantizer::test_propagate_annotation, test/test_quantization.py::TestXNNPACKQuantizer::test_set_module_name, test/test_quantization.py::TestXNNPACKQuantizer::test_set_module_type_case_2, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_avg_pool2d_recipe, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_conv2d, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_conv2d_binary_unary, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_linear_binary_dynamic, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_linear_binary_qat, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_linear_binary_unary_dynamic, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_linear_dynamic_fp16, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_linear_unary_qat, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_qat_conv2d, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_qat_conv2d_binary2, test/test_quantization.py::TestQuantizePT2EX86Inductor::test_qat_conv2d_binary_unary, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn1d::test_fold_bn_erases_bn_node, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_bn_relu_fusion_cuda, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn1d::test_qat_conv_transpose_bn_relu, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn2d::test_fold_bn_erases_bn_node, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_cuda, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_fusion_no_conv_bias, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_bn_relu_fusion, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_transpose_bn, test/test_quantization.py::TestQuantizePT2EQAT_ConvBn2d::test_qat_conv_transpose_bn_relu, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_add_shadow_loggers_mod_qat, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_extract_weights_conv_fun_ptq, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_extract_weights_dynamic, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_extract_weights_linear_fun_ptq, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_extract_weights_mod_ptq, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_extract_weights_mod_qat, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_fp16_shadows_fp32, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_int8_shadows_fp32_coverage, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_int8_shadows_int8_fun, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_linear_fp16_vs_linear_fp16_shadow_activations, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_linear_kwargs_shadow, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_match_activations_mod_ptq, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_match_activations_mod_qat, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_op_io_dtype_coverage, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_op_with_either_fp32_or_int8_input, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_ops_with_same_fp32_and_int8_signature, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_shadow_activations_fqn, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_shadow_loggers_preserve_qat_numerics, test/test_quantization.py::TestFXNumericSuiteCoreAPIs::test_user_module, test/test_quantization.py::TestFXNumericSuiteNShadows::test_add_loggers_conv_bn_relu_fusion_quant, test/test_quantization.py::TestFXNumericSuiteNShadows::test_add_loggers_linear_mod_fp32_quant, test/test_quantization.py::TestFXNumericSuiteNShadows::test_custom_functions_and_tracer, test/test_quantization.py::TestFXNumericSuiteNShadows::test_logger_enabled_and_save_activations_flags, test/test_quantization.py::TestFXNumericSuiteNShadows::test_mobilenet_v2, test/test_quantization.py::TestFXNumericSuiteNShadows::test_qconfig_multi_mapping_end_to_end, test/test_quantization.py::TestFXNumericSuiteNShadows::test_qconfig_multi_mapping_from_list, test/test_quantization.py::TestFXNumericSuiteNShadows::test_qconfig_multi_mapping_repr, test/test_quantization.py::TestFXNumericSuiteCoreAPIsModels::test_compare_shadow_activations_linear, test/test_quantization.py::TestFXNumericSuiteCoreAPIsModels::test_resnet18, test/test_quantization.py::TestFXNumericSuiteCoreAPIsModels::test_sparsenn_compare_activations, test/test_quantization.py::TestFxModelReportDetector::test_conv_sub_class_considered, test/test_quantization.py::TestFxModelReportDetector::test_fusion_layer_in_sequential, test/test_quantization.py::TestFxModelReportDetector::test_qat_aware_model_example, test/test_quantization.py::TestFxModelReportClass::test_generate_visualizer, test/test_quantization.py::TestFxDetectOutliers::test_multiple_run_consistent_spike_outlier_report_gen, test/test_quantization.py::TestFxDetectOutliers::test_outlier_detection_determine_points, test/test_quantization.py::TestFxModelReportVisualizer::test_get_modules_and_features, test/test_quantization.py::TestEqualizeFx::test_input_weight_eq_observer, test/test_quantization.py::TestEqualizeFx::test_input_weight_equalization_activation_values, test/test_quantization.py::TestEqualizeFx::test_input_weight_equalization_branching, test/test_quantization.py::TestEqualizeFx::test_selective_equalization, test/test_quantization.py::TestSerialization::test_conv2d, test/test_quantization.py::TestSerialization::test_conv2d_nobias, test/test_quantization.py::TestSerialization::test_conv3d, test/test_quantization.py::TestSerialization::test_linear_dynamic, test/test_quantization.py::TestSerialization::test_linear_relu_package_quantization_transforms, test/test_quantization.py::TestSerialization::test_per_channel_observer, test/test_quantization.py::TestSerialization::test_per_tensor_observer, test/test_quantization.py::TestQuantizeJit::test_conv, test/test_quantization.py::TestQuantizeJit::test_conv_bn, test/test_quantization.py::TestQuantizeJit::test_conv_transpose, test/test_quantization.py::TestQuantizeJit::test_single_linear_dynamic, test/test_quantization.py::TestQuantizeJitPasses::test_convtranspose_trace, test/test_quantization.py::TestQuantizeJitPasses::test_foldbn_complex_cases, test/test_quantization.py::TestQuantizeJitPasses::test_foldbn_in_submodule, test/test_quantization.py::TestQuantizeJitPasses::test_fuse_linear, test/test_quantization.py::TestQuantizeJitPasses::test_insert_observers_for_general_ops, test/test_quantization.py::TestQuantizeJitPasses::test_insert_observers_interface, test/test_quantization.py::TestQuantizeJitPasses::test_insert_observers_weight_dtype, test/test_quantization.py::TestQuantizeJitPasses::test_swap_functional_linear, test/test_quantization.py::TestQuantizeJitOps::test_cat_linear, test/test_quantization.py::TestQuantizeJitOps::test_general_shape_ops, test/test_quantization.py::TestQuantizeJitOps::test_general_value_ops, test/test_quantization.py::TestQuantizeJitOps::test_qbatch_norm_relu_BNRelu, test/test_quantization.py::TestQuantizeJitOps::test_quantized_add_alpha, test/test_quantization.py::TestQuantizeJitOps::test_quantized_add_scalar, test/test_quantization.py::TestQuantizeJitOps::test_quantized_conv, test/test_quantization.py::TestQuantizeJitOps::test_quantized_mul_scalar, test/test_quantization.py::TestQuantizeDynamicJitPasses::test_convert_dynamic_fp16, test/test_quantization.py::TestQuantizeDynamicJitPasses::test_dynamic_shared_weights, test/test_quantization.py::TestQuantizeDynamicJitOps::test_linear, test/test_quantization.py::TestAOMigrationQuantization::test_function_import_fuse_modules, test/test_quantization.py::TestAOMigrationQuantization::test_function_import_observer, test/test_quantization.py::TestAOMigrationQuantization::test_function_import_qconfig, test/test_quantization.py::TestAOMigrationQuantization::test_function_import_utils, test/test_quantization.py::TestAOMigrationNNQuantized::test_import_nn_quantizable_activation, test/test_quantization.py::TestAOMigrationNNQuantized::test_modules_conv, test/test_quantization.py::TestAOMigrationNNQuantized::test_modules_embedding_ops, test/test_quantization.py::TestAOMigrationNNQuantized::test_modules_utils, test/test_quantization.py::TestAOMigrationNNIntrinsic::test_modules_import_nn_intrinsic, test/test_quantization.py::TestAOMigrationNNIntrinsic::test_modules_intrinsic_qat_linear_fused, test/test_quantization.py::TestAOMigrationNNIntrinsic::test_modules_no_import_nn_intrinsic_quantized_dynamic, test/test_quantization.py::TestAOMigrationQuantizationFx::test_function_import_fx_equalize, test/test_quantization.py::TestAOMigrationQuantizationFx::test_function_import_fx_utils, test/test_quantization.py::TestBitsCUDA::test_types_cuda, test/test_quantization.py::TestFloat8DtypeCUDA::test_cast_round_trip_rte_cuda_float8_e4m3fn, test/test_quantization.py::TestFloat8DtypeCUDA::test_cast_round_trip_soak_cuda_float8_e8m0fnu, test/test_quantization.py::TestFloat8DtypeCUDA::test_cast_round_trip_subnormals_cuda_float8_e4m3fn, test/test_quantization.py::TestFloat8DtypeCUDA::test_cat_cuda_float8_e5m2, test/test_quantization.py::TestFloat8DtypeCUDA::test_creation_with_zeros_cuda_float8_e5m2, test/test_quantization.py::TestFloat8DtypeCUDA::test_creation_with_zeros_cuda_float8_e8m0fnu, test/test_quantization.py::TestFloat8DtypeCUDA::test_finfo_cuda_float8_e4m3fnuz, test/test_quantization.py::TestFloat8DtypeCUDA::test_finfo_cuda_float8_e8m0fnu, test/test_quantization.py::TestFloat8DtypeCUDA::test_save_load_cuda_float8_e8m0fnu, test/test_quantization.py::TestFloat8DtypeCUDA::test_special_numbers_cuda_float8_e8m0fnu, test/test_quantization.py::TestFloat8DtypeCUDA::test_type_promotion_fails_cuda_float8_e4m3fn, test/test_quantization.py::TestFloat8DtypeCUDA::test_type_promotion_fails_cuda_float8_e5m2 2025-08-14T22:26:40.9030368Z 2025-08-14T22:26:44.9375382Z 2025-08-14T22:26:44.9376147Z test_utils_config_module 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_utils_config_module_1.1_fd8e3bd90c7dceea_.log 2025-08-14T22:26:44.9382670Z Running 22 items in this shard: test/test_utils_config_module.py::TestConfigModule::test_alias, test/test_utils_config_module.py::TestConfigModule::test_bad_jk_type, test/test_utils_config_module.py::TestConfigModule::test_base_value_loading, test/test_utils_config_module.py::TestConfigModule::test_codegen_config, test/test_utils_config_module.py::TestConfigModule::test_codegen_config_function, test/test_utils_config_module.py::TestConfigModule::test_dict_copy_semantics, test/test_utils_config_module.py::TestConfigModule::test_env_name_semantics, test/test_utils_config_module.py::TestConfigModule::test_env_name_string_semantics, test/test_utils_config_module.py::TestConfigModule::test_get_hash, test/test_utils_config_module.py::TestConfigModule::test_invalid_config_float, test/test_utils_config_module.py::TestConfigModule::test_invalid_config_int, test/test_utils_config_module.py::TestConfigModule::test_make_closur_patcher, test/test_utils_config_module.py::TestConfigModule::test_multi_env, test/test_utils_config_module.py::TestConfigModule::test_none_override_semantics, test/test_utils_config_module.py::TestConfigModule::test_overrides, test/test_utils_config_module.py::TestConfigModule::test_patch, test/test_utils_config_module.py::TestConfigModule::test_reference_is_default, test/test_utils_config_module.py::TestConfigModule::test_reference_semantics, test/test_utils_config_module.py::TestConfigModule::test_save_config, test/test_utils_config_module.py::TestConfigModule::test_save_config_portable, test/test_utils_config_module.py::TestConfigModule::test_type_loading, test/test_utils_config_module.py::TestConfigModule::test_unittest_patch 2025-08-14T22:26:44.9388656Z 2025-08-14T22:26:45.0438257Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:45.0440725Z import pkg_resources 2025-08-14T22:26:45.4398438Z Running test_functionalization 1/1 ... [2025-08-14 22:26:45.439362] 2025-08-14T22:26:45.4398857Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:45.4401511Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_functionalization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:45.439838] 2025-08-14T22:26:49.0628428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:49.0631248Z import pkg_resources 2025-08-14T22:26:49.4384200Z Running test_rename_privateuse1_to_existing_device 1/1 ... [2025-08-14 22:26:49.437842] 2025-08-14T22:26:49.4384710Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:49.4385760Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_rename_privateuse1_to_existing_device.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:49.438219] 2025-08-14T22:26:50.2637784Z 2025-08-14T22:26:50.2638570Z test_functionalization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_functionalization_1.1_1945728fa9bb5a9b_.log 2025-08-14T22:26:50.2676387Z Running 112 items in this shard: test/test_functionalization.py::TestFunctionalization::test_advanced_indexing, test/test_functionalization.py::TestFunctionalization::test_advanced_indexing_correct_strides, test/test_functionalization.py::TestFunctionalization::test_aliases_maintained_after_pass_when_reapplying_views, test/test_functionalization.py::TestFunctionalization::test_as_strided, test/test_functionalization.py::TestFunctionalization::test_batch_norm, test/test_functionalization.py::TestFunctionalization::test_cat, test/test_functionalization.py::TestFunctionalization::test_channels_last_contiguous, test/test_functionalization.py::TestFunctionalization::test_copy_, test/test_functionalization.py::TestFunctionalization::test_copy_stride_mismatch, test/test_functionalization.py::TestFunctionalization::test_diagonal, test/test_functionalization.py::TestFunctionalization::test_diagonal_mutated_input, test/test_functionalization.py::TestFunctionalization::test_everything, test/test_functionalization.py::TestFunctionalization::test_expand_symint, test/test_functionalization.py::TestFunctionalization::test_fill_, test/test_functionalization.py::TestFunctionalization::test_freeze, test/test_functionalization.py::TestFunctionalization::test_index_mutation_on_non_input, test/test_functionalization.py::TestFunctionalization::test_inplace_on_non_view, test/test_functionalization.py::TestFunctionalization::test_instance_norm, test/test_functionalization.py::TestFunctionalization::test_metadata_change, test/test_functionalization.py::TestFunctionalization::test_metadata_change_out_op, test/test_functionalization.py::TestFunctionalization::test_mixed_wrappers_invalid, test/test_functionalization.py::TestFunctionalization::test_mixed_wrappers_valid, test/test_functionalization.py::TestFunctionalization::test_multi_out, test/test_functionalization.py::TestFunctionalization::test_multiple_views_of_same_base, test/test_functionalization.py::TestFunctionalization::test_mutable_op_not_inplace_or_other, test/test_functionalization.py::TestFunctionalization::test_mutation_overlapping_mem, test/test_functionalization.py::TestFunctionalization::test_nested_functions_propagate_updates, test/test_functionalization.py::TestFunctionalization::test_only_one_view, test/test_functionalization.py::TestFunctionalization::test_optional_tensor_list, test/test_functionalization.py::TestFunctionalization::test_python_functionalization, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_conj, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_is_conj, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_is_neg, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_lift_fresh, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_lift_fresh_storage, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_neg, test/test_functionalization.py::TestFunctionalization::test_python_functionalization_zero_tensor, test/test_functionalization.py::TestFunctionalization::test_reapply_views_simple, test/test_functionalization.py::TestFunctionalization::test_resize_larger_invalid, test/test_functionalization.py::TestFunctionalization::test_resize_larger_valid, test/test_functionalization.py::TestFunctionalization::test_resize_same_size_diff_rank, test/test_functionalization.py::TestFunctionalization::test_resize_smaller, test/test_functionalization.py::TestFunctionalization::test_save_for_backwards_segfault, test/test_functionalization.py::TestFunctionalization::test_scalars, test/test_functionalization.py::TestFunctionalization::test_set_, test/test_functionalization.py::TestFunctionalization::test_simple, test/test_functionalization.py::TestFunctionalization::test_simple_out, test/test_functionalization.py::TestFunctionalization::test_slice, test/test_functionalization.py::TestFunctionalization::test_split, test/test_functionalization.py::TestFunctionalization::test_split_with_sizes, test/test_functionalization.py::TestFunctionalization::test_tensor_ctr, test/test_functionalization.py::TestFunctionalization::test_tensor_list_composite, test/test_functionalization.py::TestFunctionalization::test_tensor_list_mixed_functional_nonfunctional, test/test_functionalization.py::TestFunctionalization::test_unbind, test/test_functionalization.py::TestFunctionalization::test_view_clone_view_inplace, test/test_functionalization.py::TestFunctionalization::test_view_inplace, test/test_functionalization.py::TestCrossRefFunctionalization::test_advanced_indexing, test/test_functionalization.py::TestCrossRefFunctionalization::test_advanced_indexing_correct_strides, test/test_functionalization.py::TestCrossRefFunctionalization::test_aliases_maintained_after_pass_when_reapplying_views, test/test_functionalization.py::TestCrossRefFunctionalization::test_as_strided, test/test_functionalization.py::TestCrossRefFunctionalization::test_batch_norm, test/test_functionalization.py::TestCrossRefFunctionalization::test_cat, test/test_functionalization.py::TestCrossRefFunctionalization::test_channels_last_contiguous, test/test_functionalization.py::TestCrossRefFunctionalization::test_copy_, test/test_functionalization.py::TestCrossRefFunctionalization::test_copy_stride_mismatch, test/test_functionalization.py::TestCrossRefFunctionalization::test_diagonal, test/test_functionalization.py::TestCrossRefFunctionalization::test_diagonal_mutated_input, test/test_functionalization.py::TestCrossRefFunctionalization::test_everything, test/test_functionalization.py::TestCrossRefFunctionalization::test_expand_symint, test/test_functionalization.py::TestCrossRefFunctionalization::test_fill_, test/test_functionalization.py::TestCrossRefFunctionalization::test_freeze, test/test_functionalization.py::TestCrossRefFunctionalization::test_index_mutation_on_non_input, test/test_functionalization.py::TestCrossRefFunctionalization::test_inplace_on_non_view, test/test_functionalization.py::TestCrossRefFunctionalization::test_instance_norm, test/test_functionalization.py::TestCrossRefFunctionalization::test_metadata_change, test/test_functionalization.py::TestCrossRefFunctionalization::test_metadata_change_out_op, test/test_functionalization.py::TestCrossRefFunctionalization::test_mixed_wrappers_invalid, test/test_functionalization.py::TestCrossRefFunctionalization::test_mixed_wrappers_valid, test/test_functionalization.py::TestCrossRefFunctionalization::test_multi_out, test/test_functionalization.py::TestCrossRefFunctionalization::test_multiple_views_of_same_base, test/test_functionalization.py::TestCrossRefFunctionalization::test_mutable_op_not_inplace_or_other, test/test_functionalization.py::TestCrossRefFunctionalization::test_mutation_overlapping_mem, test/test_functionalization.py::TestCrossRefFunctionalization::test_nested_functions_propagate_updates, test/test_functionalization.py::TestCrossRefFunctionalization::test_only_one_view, test/test_functionalization.py::TestCrossRefFunctionalization::test_optional_tensor_list, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_conj, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_is_conj, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_is_neg, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_lift_fresh, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_lift_fresh_storage, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_neg, test/test_functionalization.py::TestCrossRefFunctionalization::test_python_functionalization_zero_tensor, test/test_functionalization.py::TestCrossRefFunctionalization::test_reapply_views_simple, test/test_functionalization.py::TestCrossRefFunctionalization::test_resize_larger_invalid, test/test_functionalization.py::TestCrossRefFunctionalization::test_resize_larger_valid, test/test_functionalization.py::TestCrossRefFunctionalization::test_resize_same_size_diff_rank, test/test_functionalization.py::TestCrossRefFunctionalization::test_resize_smaller, test/test_functionalization.py::TestCrossRefFunctionalization::test_save_for_backwards_segfault, test/test_functionalization.py::TestCrossRefFunctionalization::test_scalars, test/test_functionalization.py::TestCrossRefFunctionalization::test_set_, test/test_functionalization.py::TestCrossRefFunctionalization::test_simple, test/test_functionalization.py::TestCrossRefFunctionalization::test_simple_out, test/test_functionalization.py::TestCrossRefFunctionalization::test_slice, test/test_functionalization.py::TestCrossRefFunctionalization::test_split, test/test_functionalization.py::TestCrossRefFunctionalization::test_split_with_sizes, test/test_functionalization.py::TestCrossRefFunctionalization::test_tensor_ctr, test/test_functionalization.py::TestCrossRefFunctionalization::test_tensor_list_composite, test/test_functionalization.py::TestCrossRefFunctionalization::test_tensor_list_mixed_functional_nonfunctional, test/test_functionalization.py::TestCrossRefFunctionalization::test_unbind, test/test_functionalization.py::TestCrossRefFunctionalization::test_view_clone_view_inplace, test/test_functionalization.py::TestCrossRefFunctionalization::test_view_inplace 2025-08-14T22:26:50.2712237Z 2025-08-14T22:26:53.9116307Z 2025-08-14T22:26:53.9117446Z test_rename_privateuse1_to_existing_device 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_rename_privateuse1_to_existing_device_1.1_21d33a157b8b10f8_.log 2025-08-14T22:26:53.9118862Z Running 1 items in this shard: test/test_rename_privateuse1_to_existing_device.py::TestRenamePrivateuseoneToExistingBackend::test_external_module_register_with_existing_backend 2025-08-14T22:26:53.9119589Z 2025-08-14T22:26:54.4595262Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:54.4596587Z import pkg_resources 2025-08-14T22:26:54.8523128Z Running test_transformers 1/1 ... [2025-08-14 22:26:54.851828] 2025-08-14T22:26:54.8524212Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:54.8525659Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_transformers.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:54.852225] 2025-08-14T22:26:58.0830794Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:26:58.0832117Z import pkg_resources 2025-08-14T22:26:58.4680238Z Running test_ao_sparsity 1/1 ... [2025-08-14 22:26:58.467501] 2025-08-14T22:26:58.4680728Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:26:58.4682898Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ao_sparsity.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:26:58.467933] 2025-08-14T22:27:03.5921610Z 2025-08-14T22:27:03.5922743Z test_ao_sparsity 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ao_sparsity_1.1_9bf0e22d6c465b9f_.log 2025-08-14T22:27:03.5947465Z Running 88 items in this shard: test/test_ao_sparsity.py::TestQuantizedSparseKernels::test_sparse_qlinear, test/test_ao_sparsity.py::TestQuantizedSparseLayers::test_sparse_qlinear, test/test_ao_sparsity.py::TestQuantizedSparseLayers::test_sparse_qlinear_serdes, test/test_ao_sparsity.py::TestFakeSparsity::test_jit_trace, test/test_ao_sparsity.py::TestFakeSparsity::test_masking_logic, test/test_ao_sparsity.py::TestFakeSparsity::test_state_dict_preserved, test/test_ao_sparsity.py::TestFakeSparsity::test_weights_parametrized, test/test_ao_sparsity.py::TestCubicScheduler::test_constructor, test/test_ao_sparsity.py::TestCubicScheduler::test_step, test/test_ao_sparsity.py::TestScheduler::test_constructor, test/test_ao_sparsity.py::TestScheduler::test_lambda_scheduler, test/test_ao_sparsity.py::TestScheduler::test_order_of_steps, test/test_ao_sparsity.py::TestScheduler::test_step, test/test_ao_sparsity.py::TestBaseSparsifier::test_constructor, test/test_ao_sparsity.py::TestBaseSparsifier::test_convert, test/test_ao_sparsity.py::TestBaseSparsifier::test_mask_squash, test/test_ao_sparsity.py::TestBaseSparsifier::test_mask_squash_with_params1, test/test_ao_sparsity.py::TestBaseSparsifier::test_mask_squash_with_params2, test/test_ao_sparsity.py::TestBaseSparsifier::test_mask_squash_with_params3, test/test_ao_sparsity.py::TestBaseSparsifier::test_prepare_config, test/test_ao_sparsity.py::TestBaseSparsifier::test_state_dict, test/test_ao_sparsity.py::TestBaseSparsifier::test_step, test/test_ao_sparsity.py::TestNearlyDiagonalSparsifier::test_constructor, test/test_ao_sparsity.py::TestNearlyDiagonalSparsifier::test_mask_squash, test/test_ao_sparsity.py::TestNearlyDiagonalSparsifier::test_prepare, test/test_ao_sparsity.py::TestNearlyDiagonalSparsifier::test_sparsity_levels, test/test_ao_sparsity.py::TestNearlyDiagonalSparsifier::test_step, test/test_ao_sparsity.py::TestWeightNormSparsifier::test_constructor, test/test_ao_sparsity.py::TestWeightNormSparsifier::test_mask_squash, test/test_ao_sparsity.py::TestWeightNormSparsifier::test_prepare, test/test_ao_sparsity.py::TestWeightNormSparsifier::test_sparsity_levels, test/test_ao_sparsity.py::TestWeightNormSparsifier::test_step, test/test_ao_sparsity.py::TestWeightNormSparsifier::test_step_2_of_4, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_complex_conv2d, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_constructor, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prepare_conv2d, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prepare_linear, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_conv2d_activation_conv2d, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_conv2d_bias_conv2d, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_conv2d_conv2d, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_conv2d_padding_conv2d, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_conv2d_pool_conv2d, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_linear_activation_linear, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_linear_bias_linear, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_linear_linear, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_lstm_layernorm_linear_multiple_layer, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_lstm_layernorm_linear_single_layer, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_lstm_linear_multiple_layer, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_prune_lstm_linear_single_layer, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_step_conv2d, test/test_ao_sparsity.py::TestBaseStructuredSparsifier::test_step_linear, test/test_ao_sparsity.py::TestFPGMPruner::test_compute_distance, test/test_ao_sparsity.py::TestFPGMPruner::test_update_mask, test/test_ao_sparsity.py::TestSaliencyPruner::test_lstm_saliency_pruner_update_mask, test/test_ao_sparsity.py::TestSaliencyPruner::test_saliency_pruner_update_mask, test/test_ao_sparsity.py::TestComposability::test_convert_without_squash_mask, test/test_ao_sparsity.py::TestComposability::test_fusion_before_s_prep, test/test_ao_sparsity.py::TestComposability::test_q_prep_before_s_prep, test/test_ao_sparsity.py::TestComposability::test_qat_prep_before_s_prep, test/test_ao_sparsity.py::TestComposability::test_s_prep_before_fusion, test/test_ao_sparsity.py::TestComposability::test_s_prep_before_q_prep, test/test_ao_sparsity.py::TestComposability::test_s_prep_before_qat_prep, test/test_ao_sparsity.py::TestFxComposability::test_q_prep_fx_before_s_prep, test/test_ao_sparsity.py::TestFxComposability::test_q_prep_fx_s_prep_ref_conv, test/test_ao_sparsity.py::TestFxComposability::test_s_prep_before_q_prep_fx, test/test_ao_sparsity.py::TestFxComposability::test_s_prep_before_qat_prep_fx, test/test_ao_sparsity.py::TestFxComposability::test_s_prep_q_prep_fx_ref, test/test_ao_sparsity.py::TestActivationSparsifier::test_activation_sparsifier, test/test_ao_sparsity.py::TestBaseDataScheduler::test_constructor, test/test_ao_sparsity.py::TestBaseDataScheduler::test_order_of_steps, test/test_ao_sparsity.py::TestBaseDataScheduler::test_state_dict, test/test_ao_sparsity.py::TestBaseDataScheduler::test_step, test/test_ao_sparsity.py::TestBaseDataSparsifier::test_nn_embeddings, test/test_ao_sparsity.py::TestBaseDataSparsifier::test_nn_parameters, test/test_ao_sparsity.py::TestBaseDataSparsifier::test_tensors, test/test_ao_sparsity.py::TestNormDataSparsifiers::test_nn_embeddings, test/test_ao_sparsity.py::TestNormDataSparsifiers::test_nn_parameters, test/test_ao_sparsity.py::TestNormDataSparsifiers::test_tensors, test/test_ao_sparsity.py::TestQuantizationUtils::test_ptq_quantize_first, test/test_ao_sparsity.py::TestQuantizationUtils::test_ptq_sparsify_first, test/test_ao_sparsity.py::TestSparsityUtilFunctions::test_fqn_to_module, test/test_ao_sparsity.py::TestSparsityUtilFunctions::test_fqn_to_module_fail, test/test_ao_sparsity.py::TestSparsityUtilFunctions::test_fqn_to_module_for_tensors, test/test_ao_sparsity.py::TestSparsityUtilFunctions::test_get_arg_info_from_tensor_fqn, test/test_ao_sparsity.py::TestSparsityUtilFunctions::test_get_arg_info_from_tensor_fqn_fail, test/test_ao_sparsity.py::TestSparsityUtilFunctions::test_module_to_fqn, test/test_ao_sparsity.py::TestSparsityUtilFunctions::test_module_to_fqn_fail, test/test_ao_sparsity.py::TestSparsityUtilFunctions::test_module_to_fqn_root 2025-08-14T22:27:03.5971028Z 2025-08-14T22:27:07.7472414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:27:07.7474968Z import pkg_resources 2025-08-14T22:27:08.1373650Z Running test_license 1/1 ... [2025-08-14 22:27:08.136842] 2025-08-14T22:27:08.1374231Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:27:08.1377110Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_license.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:27:08.137306] 2025-08-14T22:27:12.6600610Z 2025-08-14T22:27:12.6601882Z test_license 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_license_1.1_b42e3f3f0e8a4875_.log 2025-08-14T22:27:12.6603392Z Running 2 items in this shard: test/test_license.py::TestLicense::test_distinfo_license, test/test_license.py::TestLicense::test_license_for_wheel 2025-08-14T22:27:12.6604471Z 2025-08-14T22:27:16.8740960Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:27:16.8743389Z import pkg_resources 2025-08-14T22:27:17.2645656Z Running torch_np/test_binary_ufuncs 1/1 ... [2025-08-14 22:27:17.264070] 2025-08-14T22:27:17.2646103Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:27:17.2649433Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_binary_ufuncs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:27:17.264488] 2025-08-14T22:27:21.9379479Z 2025-08-14T22:27:21.9380952Z torch_np/test_binary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_binary_ufuncs_1.1_94af6d5d3e2e4855_.log 2025-08-14T22:27:21.9397841Z Running 38 items in this shard: test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_add, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_arctan2, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_bitwise_and, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_bitwise_or, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_bitwise_xor, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_copysign, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_divide, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_equal, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_float_power, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_floor_divide, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_fmax, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_fmin, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_fmod, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_gcd, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_greater, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_greater_equal, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_heaviside, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_hypot, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_lcm, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_ldexp, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_left_shift, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_less, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_less_equal, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_logaddexp, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_logaddexp2, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_logical_and, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_logical_or, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_logical_xor, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_matmul, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_maximum, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_minimum, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_multiply, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_nextafter, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_not_equal, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_power, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_remainder, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_right_shift, test/torch_np/test_binary_ufuncs.py::TestBinaryUfuncBasic::test_subtract 2025-08-14T22:27:21.9416269Z 2025-08-14T22:27:26.0721673Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:27:26.0723329Z import pkg_resources 2025-08-14T22:27:26.4664669Z Running torch_np/test_unary_ufuncs 1/1 ... [2025-08-14 22:27:26.465916] 2025-08-14T22:27:26.4665210Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:27:26.4668240Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_unary_ufuncs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:27:26.466357] 2025-08-14T22:27:26.6857676Z 2025-08-14T22:27:26.6858428Z test_transformers 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_transformers_1.1_3eb8e83083563ca4_.log 2025-08-14T22:27:27.5847566Z Running 12380 items in this shard: test/test_transformers.py::TestTransformersCUDA::test_bias_is_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_decoder_only_layer_cuda, test/test_transformers.py::TestTransformersCUDA::test_decoder_padding_and_src_mask_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_disable_fastpath_cuda, test/test_transformers.py::TestTransformersCUDA::test_encoder_is_causal_cuda, test/test_transformers.py::TestTransformersCUDA::test_encoder_padding_and_src_mask_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_is_causal_gpu_cuda, test/test_transformers.py::TestTransformersCUDA::test_kpm_mask_trailing_column_with_nested_tensor_cuda, test/test_transformers.py::TestTransformersCUDA::test_mask_check_fastpath_cuda, test/test_transformers.py::TestTransformersCUDA::test_math_backend_high_precision_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_1_bias_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_1_bias_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_8_bias_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_mha_native_args_nb_heads_8_bias_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim2_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim1_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim1_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim_2_bool_cuda, test/test_transformers.py::TestTransformersCUDA::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim_2_float32_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_0_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_2_cuda, test/test_transformers.py::TestTransformersCUDA::test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_5_cuda, test/test_transformers.py::TestTransformersCUDA::test_script_encoder_subclass_cuda, test/test_transformers.py::TestTransformersCUDA::test_script_mha_in_proj_weight_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_self_attn_TxT_attn_mask_cuda, test/test_transformers.py::TestTransformersCUDA::test_train_with_is_causal_cuda, test/test_transformers.py::TestTransformersCUDA::test_train_with_pad_and_catch_error_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformer_bias_is_none_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_False_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_False_training_True_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_False_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_batch_first_True_training_True_enable_nested_tensor_True_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_False_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_False_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_True_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_True_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_False_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_False_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_True_d_model_12_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_True_d_model_256_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_False_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_False_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_True_training_False_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoder_square_input_with_no_grad_True_training_True_enable_nested_tensor_False_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_no_fastpath_with_hooks_nhead_3_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_no_fastpath_with_hooks_nhead_4_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_1_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_4_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_src_mask_nhead_8_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_subclass_cuda, test/test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_subclass_model_cuda, test/test_transformers.py::TestTransformersCUDA::test_with_nested_tensor_input_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_dispatch_fails_no_backend_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_atteention_large_bf16_nan_values_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_attention_fail_with_non_square_causal_attention_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_autocast_fp32_bfloat16_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_autocast_fp32_float16_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_193_dropout_p_0_0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_193_dropout_p_0_2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_256_dropout_p_0_0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_backward_failure_sm86plus_head_dim_256_dropout_p_0_2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_flash_fail_fp32_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_nested_broadcasting_error_cases_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_nested_broadcasting_requires_grad_failure_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_seq_len_0_inputs_fused_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_fused_kernels_seq_len_0_inputs_fused_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_attn_mask_present_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_broadcast_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_dim_3_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_head_dim_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_fused_inputs_invalid_dtype_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_1_dimensional_inputs_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_datatypes_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_inputs_different_devices_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_last_dim_stride_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sdpa_kernel_grouped_query_attention_cuda_fused_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_invalid_sequence_lengths_kernel2_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mask_invalid_last_dim_stride_kernel0_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mask_invalid_last_dim_stride_kernel1_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_fail_with_batch_size_geq_65536_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_fail_with_batch_size_geq_65536_error_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_eff_attention_large_seq_len_uniform_attention_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_mem_efficient_fail_bfloat16_less_than_sm80_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_nested_fails_on_padding_head_dim_cuda, test/test_transformers.py::TestSDPAFailureModesCUDA::test_unaligned_tensors_cuda, test/test_transformers.py::TestSDPACUDA::test_scaled_dot_product_attention_math_with_negative_scale_kernel0_cuda, test/test_transformers.py::TestSDPACUDA::test_sdp_math_gradcheck_contiguous_inputs_False_cuda, test/test_transformers.py::TestSDPACUDA::test_sdp_math_gradcheck_contiguous_inputs_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_d256_heuristic_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_different_dk_dv_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_fail_d128_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_gqa_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_nonmodulo64seqlen_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_preserves_query_layout_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_cudnn_attention_trivial_output_transpose_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_1_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_143_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_127_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_4_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_203_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_256_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_False_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale0_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_False_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_bfloat16_scale_l1_enable_gqa_True_n_heads1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale0_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_False_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_batch_size_8_seq_len_q_4_seq_len_k_579_head_dim_8_is_causal_True_dropout_p_0_48_float16_scale_l1_enable_gqa_True_n_heads1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_32_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_256_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_256_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_64_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_0_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale0_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_False_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_flash_attention_vs_math_ref_grads_nestedtensor_batch_size_8_max_seq_len_q_32_max_seq_len_kv_32_head_dim_8_dropout_p_0_1_float16_scale_l1_is_causal_True_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_different_dk_dv_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_1_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_1024_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_1024_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_32_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_False_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_0_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale0_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_attention_vs_math_ref_grads_cudagraph_batch_size_8_seq_len_q_256_seq_len_k_256_head_dim_64_is_causal_True_dropout_p_0_22_float16_scale_l1_fused_kernel2_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel0_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel0_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel1_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel1_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel2_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_backwards_throws_determinism_warning_fused_kernel2_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel0_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_False_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_False_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_False_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_False_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_False_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_kernel1_expand_q_batch_True_expand_k_batch_True_expand_v_batch_True_expand_q_num_heads_True_expand_k_num_heads_True_expand_v_num_heads_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_nested_broadcasting_query_dense_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_seq_len_1_inputs_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_kernels_seq_len_1_inputs_fused_kernel1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_choice_type_dense_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_choice_type_nested_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_priority_order_use_compile_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_fused_sdp_priority_order_use_compile_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_long_sequence_mask_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_long_sequence_mask_float32_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contig_mask_bug_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contiguous_mask_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_attention_non_contiguous_mask_float32_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_eff_backwards_determinism_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_312_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_408_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_attn_mask_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_2_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_3_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_mask_variants_mask_dim_4_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_1_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_1024_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_103_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_2048_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_1024_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_103_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_2048_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_128_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_16_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_8_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_False_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_0_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale0_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_bfloat16_scale_l1_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale0_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float16_scale_l1_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale0_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_mem_efficient_attention_vs_math_ref_grads_batch_size_8_seq_len_q_8_seq_len_k_8_head_dim_96_is_causal_True_dropout_p_0_22_float32_scale_l1_cuda_float32, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_cudnn_nested_type_nested_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_dense_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_dense_fused_kernel1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_nested_fused_kernel0_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_accuracy_type_nested_fused_kernel1_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_dense_is_contiguous_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_dense_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_nested_is_contiguous_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_scaled_dot_product_attention_fused_kernels_packed_type_nested_is_contiguous_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_choice_with_determinism_warn_only_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_choice_with_determinism_warn_only_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_False_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_False_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_True_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_False_is_causal_True_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_False_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_False_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_True_bfloat16_cuda_bfloat16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_flash_attention_grad_against_math_contiguous_inputs_True_is_causal_True_float16_cuda_float16, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_False_is_causal_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_False_is_causal_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_True_is_causal_False_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_sdp_mem_efficient_grad_against_math_contiguous_inputs_True_is_causal_True_cuda, test/test_transformers.py::TestSDPACudaOnlyCUDA::test_singelton_head_dim_stride_ne_1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_LOWER_RIGHT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_causal_variant_CausalVariant_UPPER_LEFT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_LOWER_RIGHT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_causal_variants_compile_causal_variant_CausalVariant_UPPER_LEFT_shape3_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_and_mask_fails_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape0_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape1_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape2_cuda, test/test_transformers.py::TestAttnBiasCUDA::test_is_causal_equals_upper_left_shape3_cuda 2025-08-14T22:27:28.4700557Z 2025-08-14T22:27:30.8357653Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:27:30.8358991Z import pkg_resources 2025-08-14T22:27:31.1913726Z 2025-08-14T22:27:31.1914508Z torch_np/test_unary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_unary_ufuncs_1.1_cae29295b5303a0d_.log 2025-08-14T22:27:31.1924761Z Running 42 items in this shard: test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_absolute, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_arccos, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_arccosh, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_arcsin, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_arcsinh, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_arctan, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_arctanh, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_cbrt, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_ceil, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_conjugate, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_cos, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_cosh, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_deg2rad, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_degrees, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_exp, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_exp2, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_expm1, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_fabs, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_floor, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_isfinite, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_isinf, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_isnan, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_log, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_log10, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_log1p, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_log2, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_logical_not, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_negative, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_positive, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_rad2deg, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_radians, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_reciprocal, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_rint, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_sign, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_signbit, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_sin, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_sinh, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_sqrt, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_square, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_tan, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_tanh, test/torch_np/test_unary_ufuncs.py::TestUnaryUfuncs::test_trunc 2025-08-14T22:27:31.1934414Z 2025-08-14T22:27:31.2155767Z Running test_foreach 1/1 ... [2025-08-14 22:27:31.215088] 2025-08-14T22:27:31.2156147Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:27:31.2159965Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_foreach.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:27:31.215507] 2025-08-14T22:27:35.2835317Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:27:35.2837504Z import pkg_resources 2025-08-14T22:27:35.6585584Z Running test_expanded_weights 1/1 ... [2025-08-14 22:27:35.657990] 2025-08-14T22:27:35.6586042Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:27:35.6588114Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_expanded_weights.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:27:35.658439] 2025-08-14T22:27:41.7880327Z 2025-08-14T22:27:41.7881286Z test_expanded_weights 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_expanded_weights_1.1_4975a78c4a2c19ca_.log 2025-08-14T22:27:41.7981188Z Running 220 items in this shard: test/test_expanded_weights.py::TestExpandedWeightHelperFunctionCUDA::test_forward_helper_cuda, test/test_expanded_weights.py::TestExpandedWeightHelperFunctionCUDA::test_forward_helper_failure_args_cuda, test/test_expanded_weights.py::TestExpandedWeightHelperFunctionCUDA::test_set_grad_sample_if_exists_cuda, test/test_expanded_weights.py::TestExpandedWeightHelperFunctionCUDA::test_set_grad_sample_if_exists_failure_cuda, test/test_expanded_weights.py::TestExpandedWeightHelperFunctionCUDA::test_sum_over_all_but_batch_and_last_n_cuda, test/test_expanded_weights.py::TestExpandedWeightHelperFunctionCUDA::test_unpack_expanded_weight_or_tensor_cuda, test/test_expanded_weights.py::TestExpandedWeightHelperFunctionCUDA::test_unpack_expanded_weight_or_tensor_failure_cuda, test/test_expanded_weights.py::TestExpandedWeightHelperFunctionCUDA::test_unpack_expanded_weight_or_tensor_with_custom_function_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_cnn_model_mean_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_cnn_model_sum_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_embedding_model_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_error_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv1d_cuda_bfloat16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv1d_cuda_complex128, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv1d_cuda_complex32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv1d_cuda_complex64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv1d_cuda_float16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv1d_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv1d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv2d_cuda_bfloat16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv2d_cuda_complex128, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv2d_cuda_complex32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv2d_cuda_complex64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv2d_cuda_float16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv2d_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv2d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv3d_cuda_bfloat16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv3d_cuda_complex128, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv3d_cuda_complex32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv3d_cuda_complex64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv3d_cuda_float16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv3d_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_conv3d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_embedding_cuda_bfloat16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_embedding_cuda_float16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_embedding_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_embedding_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_group_norm_cuda_bfloat16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_group_norm_cuda_float16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_group_norm_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_group_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_instance_norm_cuda_bfloat16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_instance_norm_cuda_float16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_instance_norm_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_instance_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_layer_norm_cuda_bfloat16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_layer_norm_cuda_float16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_layer_norm_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_layer_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_linear_cuda_bfloat16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_linear_cuda_complex128, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_linear_cuda_complex64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_linear_cuda_float16, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_linear_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_forward_nn_functional_linear_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_mean_nn_functional_conv1d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_mean_nn_functional_conv2d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_mean_nn_functional_conv3d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_mean_nn_functional_embedding_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_mean_nn_functional_group_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_mean_nn_functional_instance_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_mean_nn_functional_layer_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_mean_nn_functional_linear_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_sum_nn_functional_conv1d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_sum_nn_functional_conv2d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_sum_nn_functional_conv3d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_sum_nn_functional_embedding_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_sum_nn_functional_group_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_sum_nn_functional_instance_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_sum_nn_functional_layer_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weight_per_sample_grad_sum_nn_functional_linear_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_conv1d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_conv2d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_conv3d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_embedding_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_group_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_instance_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_layer_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_expanded_weights_per_sample_grad_input_no_grad_nn_functional_linear_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_group_norm_error_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_group_norm_model_num_dim_1_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_group_norm_model_num_dim_2_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_group_norm_model_num_dim_3_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_instance_norm_model_num_dim_1_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_instance_norm_model_num_dim_2_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_instance_norm_model_num_dim_3_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_layer_norm_model_num_dim_1_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_layer_norm_model_num_dim_2_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_layer_norm_model_num_dim_3_cuda, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_unsupported_expand_weights_nn_functional_conv1d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_unsupported_expand_weights_nn_functional_conv2d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_unsupported_expand_weights_nn_functional_conv3d_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_unsupported_expand_weights_nn_functional_embedding_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_unsupported_expand_weights_nn_functional_group_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_unsupported_expand_weights_nn_functional_instance_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_unsupported_expand_weights_nn_functional_layer_norm_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightFunctionalCUDA::test_unsupported_expand_weights_nn_functional_linear_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_circular_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_circular_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_circular_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad1_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad1_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad1_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad1size1_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad1size1_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad1size1_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad2size1_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad2size1_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_pad2size1_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_reflect_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_reflect_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_reflect_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_replicate_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_replicate_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_replicate_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_stride_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_stride_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_stride_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_zero_batch_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_zero_batch_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_zero_batch_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_zeros_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_zeros_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv1d_zeros_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_circular_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_circular_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_circular_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_dilated_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_dilated_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_dilated_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_no_bias_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_no_bias_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_no_bias_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_padding_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_padding_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_padding_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_reflect_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_reflect_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_reflect_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_replicate_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_replicate_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_replicate_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_strided_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_strided_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_strided_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_zero_batch_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_zero_batch_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_zero_batch_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_zeros_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_zeros_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv2d_zeros_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_1x1x1_no_bias_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_1x1x1_no_bias_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_1x1x1_no_bias_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_circular_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_circular_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_circular_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_no_bias_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_no_bias_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_no_bias_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_replicate_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_replicate_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_replicate_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_stride_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_stride_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_stride_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_stride_padding_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_stride_padding_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_stride_padding_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_zero_batch_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_zero_batch_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_zero_batch_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_zeros_stride2_pad2_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_zeros_stride2_pad2_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Conv3d_zeros_stride2_pad2_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Embedding_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Embedding_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Embedding_discontiguous_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Embedding_discontiguous_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Embedding_discontiguous_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Embedding_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_LayerNorm_3d_no_affine_large_feature_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_LayerNorm_3d_no_affine_large_feature_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_LayerNorm_3d_no_affine_large_feature_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_no_batch_dim_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_no_batch_dim_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_no_batch_dim_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_no_bias_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_no_bias_cuda_double_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_Linear_no_bias_multiple_inputs_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_GRU_eval_mode_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_GRU_eval_mode_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_GRU_train_mode_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_GRU_train_mode_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_LSTM_eval_mode_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_LSTM_eval_mode_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_LSTM_train_mode_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_LSTM_train_mode_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_RNN_eval_mode_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_RNN_eval_mode_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_RNN_train_mode_cuda_float32, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_module_nn_RNN_train_mode_cuda_float64, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_per_sample_api_compute_batch_size_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_per_sample_api_compute_batch_size_not_pytreeable_cuda, test/test_expanded_weights.py::TestExpandedWeightModuleCUDA::test_per_sample_api_failing_cuda 2025-08-14T22:27:41.8072601Z 2025-08-14T22:27:45.8451081Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:27:45.8452419Z import pkg_resources 2025-08-14T22:27:46.2198912Z Running typing/test_python_operators 1/1 ... [2025-08-14 22:27:46.219402] 2025-08-14T22:27:46.2199572Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:27:46.2201882Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'typing/test_python_operators.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:27:46.219805] 2025-08-14T22:27:46.8615103Z 2025-08-14T22:27:46.8615739Z test_foreach 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_foreach_1.1_0ddcf304dd3b0413_.log 2025-08-14T22:27:46.9881605Z Running 3577 items in this shard: test/test_foreach.py::TestForeachCUDA::test_0dim_tensor_overload_cpu_ok_cuda, test/test_foreach.py::TestForeachCUDA::test_0dim_tensor_overload_exception_cuda, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_add_scalar_with_empty_list_and_empty_tensor_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_abs_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_acos_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_addcdiv_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_addcmul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_asin_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_atan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_ceil_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_copy_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_cos_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_cosh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_erf_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_erfc_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_exp_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_expm1_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_floor_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_frac_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_lerp_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_lgamma_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_log10_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_log1p_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_log2_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_log_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_neg_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_norm_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_reciprocal_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_round_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_rsqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_sigmoid_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_sign_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_sin_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_sinh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_sqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_tan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_tanh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_trunc_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_all_zero_size_tensors_do_not_launch_kernel__foreach_zero_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_abs_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_abs_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_abs_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_abs_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_acos_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_acos_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_acos_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_acos_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_add_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_add_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_add_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_add_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_addcdiv_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_addcdiv_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_addcdiv_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_addcdiv_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_addcmul_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_addcmul_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_addcmul_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_addcmul_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_asin_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_asin_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_asin_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_asin_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_atan_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_atan_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_atan_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_atan_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_ceil_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_ceil_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_ceil_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_ceil_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_clamp_max_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_clamp_max_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_clamp_max_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_clamp_max_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_clamp_min_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_clamp_min_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_clamp_min_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_clamp_min_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_copy_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_copy_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_copy_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_copy_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_cos_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_cos_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_cos_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_cos_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_cosh_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_cosh_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_cosh_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_cosh_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_div_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_div_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_div_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_div_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_erf_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_erf_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_erf_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_erf_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_erfc_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_erfc_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_erfc_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_erfc_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_exp_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_exp_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_exp_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_exp_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_expm1_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_expm1_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_expm1_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_expm1_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_floor_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_floor_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_floor_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_floor_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_frac_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_frac_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_frac_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_frac_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_lerp_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_lerp_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_lerp_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_lerp_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_lgamma_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_lgamma_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_lgamma_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_lgamma_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log10_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log10_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log10_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log10_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log1p_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log1p_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log1p_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log1p_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log2_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log2_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log2_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log2_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_log_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_max_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_max_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_max_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_max_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_maximum_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_maximum_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_maximum_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_maximum_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_minimum_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_minimum_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_minimum_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_minimum_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_mul_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_mul_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_mul_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_mul_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_neg_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_neg_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_neg_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_neg_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_norm_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_norm_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_norm_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_norm_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_pow_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_pow_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_pow_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_pow_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_reciprocal_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_reciprocal_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_reciprocal_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_reciprocal_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_round_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_round_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_round_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_round_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_rsqrt_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_rsqrt_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_rsqrt_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_rsqrt_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sigmoid_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sigmoid_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sigmoid_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sigmoid_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sign_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sign_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sign_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sign_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sin_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sin_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sin_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sin_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sinh_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sinh_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sinh_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sinh_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sqrt_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sqrt_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sqrt_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sqrt_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sub_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sub_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sub_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_sub_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_tan_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_tan_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_tan_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_tan_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_tanh_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_tanh_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_tanh_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_tanh_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_trunc_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_trunc_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_trunc_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_trunc_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_zero_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_zero_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_zero_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_autodiff__foreach_zero_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_max_use_cuda_graph_False_w_empty_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_max_use_cuda_graph_False_w_empty_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_max_use_cuda_graph_False_w_empty_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_max_use_cuda_graph_False_w_empty_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_max_use_cuda_graph_True_w_empty_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_max_use_cuda_graph_True_w_empty_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_max_use_cuda_graph_True_w_empty_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_max_use_cuda_graph_True_w_empty_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_norm_use_cuda_graph_False_w_empty_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_norm_use_cuda_graph_False_w_empty_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_norm_use_cuda_graph_False_w_empty_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_norm_use_cuda_graph_False_w_empty_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_norm_use_cuda_graph_True_w_empty_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_norm_use_cuda_graph_True_w_empty_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_norm_use_cuda_graph_True_w_empty_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_big_num_tensors__foreach_norm_use_cuda_graph_True_w_empty_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_add_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_add_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_add_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_clamp_max_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_clamp_max_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_clamp_max_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_clamp_min_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_clamp_min_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_clamp_min_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_div_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_div_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_div_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_maximum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_maximum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_maximum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_minimum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_minimum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_minimum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_mul_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_mul_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_mul_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_pow_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_pow_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_pow_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_sub_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_sub_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_float_inf_nan__foreach_sub_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_add_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_max_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_clamp_min_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_div_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_maximum_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_minimum_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_mul_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_pow_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_error_cases__foreach_sub_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_add_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_max_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_clamp_min_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_div_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_maximum_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_minimum_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_mul_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_pow_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_list_slow_path__foreach_sub_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_different_tensor_dtypes__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_add_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_max_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_clamp_min_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_div_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_maximum_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_minimum_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_mul_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_pow_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_scalar_with_overlapping_tensors__foreach_sub_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_add_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_max_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_clamp_min_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_div_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_maximum_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_minimum_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_mul_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_pow_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_tensors_on_different_devices__foreach_sub_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_False_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_div_reciprocal_cuda, test/test_foreach.py::TestForeachCUDA::test_foreach_check_stride_ignore_dims_of_one_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_different_device_inputs__foreach_copy_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_device_inputs__foreach_copy_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_foreach_copy_with_multi_dtypes_large_input_cuda, test/test_foreach.py::TestForeachCUDA::test_foreach_l2_large_value_input__foreach_norm_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_foreach_l2_large_value_input__foreach_norm_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_max_w_empty_True_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_False_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_foreach_reduce_large_input__foreach_norm_w_empty_True_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_abs_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_acos_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_addcdiv_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_addcmul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_asin_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_atan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_ceil_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_copy_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_cos_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_cosh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_erf_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_erfc_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_exp_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_expm1_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_floor_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_frac_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_lerp_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_lgamma_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_log10_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_log1p_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_log2_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_log_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_neg_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_reciprocal_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_round_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_rsqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_sigmoid_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_sign_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_sin_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_sinh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_sqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_tan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_tanh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_trunc_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_inplace_foreach_leaf_check_and_grad_fn__foreach_zero_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_exp_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_expm1_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_reciprocal_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_rsqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_sigmoid_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_sqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_tan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_lifetime_of_grad_fn_when_result_is_saved__foreach_tanh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_abs_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_acos_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_add_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_addcdiv_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_addcmul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_asin_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_atan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_ceil_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_clamp_max_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_clamp_min_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_cos_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_cosh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_div_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_erf_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_erfc_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_exp_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_expm1_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_floor_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_frac_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_lerp_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_lgamma_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_log10_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_log1p_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_log2_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_log_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_maximum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_minimum_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_mul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_neg_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_pow_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_reciprocal_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_round_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_rsqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_sigmoid_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_sign_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_sin_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_sinh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_sqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_sub_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_tan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_tanh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_outplace_with_invalid_grads__foreach_trunc_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_abs_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_acos_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_add_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcdiv_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_addcmul_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_asin_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_atan_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_ceil_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_max_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_clamp_min_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_copy_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cos_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_cosh_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_div_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erf_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_erfc_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_exp_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_expm1_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_floor_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_frac_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lerp_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_lgamma_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log10_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log1p_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log2_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_log_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_max_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_maximum_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_minimum_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_mul_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_neg_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_norm_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_pow_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_reciprocal_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_round_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_rsqrt_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sigmoid_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sign_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sin_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sinh_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sqrt_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_sub_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tan_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_tanh_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_trunc_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_fastpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_inplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_parity__foreach_zero_slowpath_outplace_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_tensors_on_different_devices__foreach_addcdiv_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_tensors_on_different_devices__foreach_addcdiv_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_tensors_on_different_devices__foreach_addcmul_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_tensors_on_different_devices__foreach_addcmul_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_False_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcdiv_is_fastpath_True_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_False_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_pointwise_op_with_tensor_of_scalarlist_overload__foreach_addcmul_is_fastpath_True_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_tensors_grouping_cuda, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_abs_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_acos_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_asin_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_atan_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_ceil_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cos_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_cosh_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erf_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_erfc_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_exp_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_expm1_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_floor_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_frac_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_lgamma_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log10_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log1p_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log2_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_log_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_neg_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_reciprocal_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_round_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_rsqrt_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sigmoid_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sign_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sin_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sinh_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_sqrt_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tan_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_tanh_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_trunc_cuda_uint8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_bfloat16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_bool, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_complex128, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_complex64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_float16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_float32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_float64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_int16, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_int32, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_int64, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_int8, test/test_foreach.py::TestForeachCUDA::test_unary_op_tensors_on_different_devices__foreach_zero_cuda_uint8 2025-08-14T22:27:47.1113288Z 2025-08-14T22:27:50.9472422Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:27:50.9473781Z import pkg_resources 2025-08-14T22:27:51.3341874Z Running test_fx_experimental 1/1 ... [2025-08-14 22:27:51.333610] 2025-08-14T22:27:51.3342321Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:27:51.3343645Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx_experimental.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:27:51.334020] 2025-08-14T22:27:51.3956781Z 2025-08-14T22:27:51.3957975Z typing/test_python_operators 1/1 was successful, full logs can be found in artifacts with path test/test-reports/typing.test_python_operators_1.1_89efbe02bd836992_.log 2025-08-14T22:27:51.4064594Z Running 318 items in this shard: test/typing/test_python_operators.py::TestPythonOperators::test_binary_a100_op_%_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a101_op_%_b101, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a102_op_%_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a103_op_%_b103, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a104_op_*_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a105_op_*_b105, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a106_op_*_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a107_op_*_b107, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a108_op_**_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a109_op_**_b109, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a110_op_**_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a111_op_**_b111, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a112_op_+_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a113_op_+_b113, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a114_op_+_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a115_op_+_b115, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a116_op_-_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a117_op_-_b117, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a118_op_-_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a119_op_-_b119, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a120_op_/_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a121_op_/_b121, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a122_op_/_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a123_op_/_b123, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a124_op_//_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a125_op_//_b125, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a126_op_//_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a127_op_//_b127, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a128_op_&_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a129_op_&_b129, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a130_op_&_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a131_op_&_b131, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a132_op_<<_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a133_op_<<_b133, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a134_op_<<_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a135_op_<<_b135, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a136_op_>>_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a137_op_>>_b137, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a138_op_>>_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a139_op_>>_b139, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a140_op_^_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a141_op_^_b141, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a142_op_^_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a143_op_^_b143, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a144_op_|_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a145_op_|_b145, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a146_op_|_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a147_op_|_b147, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a148_op_@_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a149_op_@_b149, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a150_op_@_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a151_op_@_b151, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a228_op_!=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a229_op_!=_b229, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a230_op_!=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a231_op_!=_b231, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a232_op_<_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a233_op_<_b233, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a234_op_<_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a235_op_<_b235, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a236_op_<=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a237_op_<=_b237, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a238_op_<=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a239_op_<=_b239, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a240_op_==_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a241_op_==_b241, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a242_op_==_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a243_op_==_b243, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a244_op_>_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a245_op_>_b245, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a246_op_>_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a247_op_>_b247, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a248_op_>=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a249_op_>=_b249, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a250_op_>=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a251_op_>=_b251, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a252_op_%_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a253_op_%_b253, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a254_op_%_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a255_op_%_b255, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a256_op_*_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a257_op_*_b257, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a258_op_*_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a259_op_*_b259, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a260_op_**_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a261_op_**_b261, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a262_op_**_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a263_op_**_b263, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a264_op_+_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a265_op_+_b265, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a266_op_+_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a267_op_+_b267, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a268_op_-_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a269_op_-_b269, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a270_op_-_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a271_op_-_b271, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a272_op_/_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a273_op_/_b273, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a274_op_/_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a275_op_/_b275, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a276_op_//_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a277_op_//_b277, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a278_op_//_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a279_op_//_b279, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a280_op_&_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a281_op_&_b281, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a282_op_&_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a283_op_&_b283, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a284_op_<<_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a285_op_<<_b285, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a286_op_<<_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a287_op_<<_b287, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a288_op_>>_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a289_op_>>_b289, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a290_op_>>_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a291_op_>>_b291, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a292_op_^_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a293_op_^_b293, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a294_op_^_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a295_op_^_b295, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a296_op_|_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a297_op_|_b297, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a298_op_|_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a299_op_|_b299, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a300_op_@_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a301_op_@_b301, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a302_op_@_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a303_op_@_b303, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a76_op_!=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a77_op_!=_b77, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a78_op_!=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a79_op_!=_b79, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a80_op_<_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a81_op_<_b81, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a82_op_<_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a83_op_<_b83, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a84_op_<=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a85_op_<=_b85, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a86_op_<=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a87_op_<=_b87, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a88_op_==_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a89_op_==_b89, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a90_op_==_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a91_op_==_b91, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a92_op_>_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a93_op_>_b93, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a94_op_>_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a95_op_>_b95, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a96_op_>=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a97_op_>=_b97, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a98_op_>=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a99_op_>=_b99, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_!=_b1, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_!=_b3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_!=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_!=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_%_b25, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_%_b27, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_%_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_%_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_&_b53, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_&_b55, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_&_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_&_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_**_b33, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_**_b35, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_**_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_**_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_*_b29, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_*_b31, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_*_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_*_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_+_b37, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_+_b39, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_+_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_+_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_-_b41, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_-_b43, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_-_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_-_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_//_b49, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_//_b51, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_//_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_//_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_/_b45, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_/_b47, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_/_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_/_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<<_b57, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<<_b59, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<<_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<<_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<=_b11, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<=_b9, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<_b5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<_b7, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_<_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_==_b13, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_==_b15, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_==_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_==_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>=_b21, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>=_b23, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>>_b61, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>>_b63, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>>_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>>_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>_b17, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>_b19, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_>_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_@_b73, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_@_b75, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_@_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_@_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_^_b65, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_^_b67, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_^_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_^_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_|_b69, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_|_b71, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_|_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_1_5_op_|_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_!=_b153, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_!=_b155, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_!=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_!=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_%_b177, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_%_b179, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_%_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_%_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_&_b205, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_&_b207, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_&_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_&_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_**_b185, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_**_b187, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_**_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_**_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_*_b181, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_*_b183, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_*_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_*_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_+_b189, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_+_b191, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_+_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_+_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_-_b193, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_-_b195, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_-_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_-_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_//_b201, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_//_b203, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_//_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_//_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_/_b197, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_/_b199, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_/_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_/_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<<_b209, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<<_b211, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<<_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<<_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<=_b161, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<=_b163, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<_b157, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<_b159, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_<_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_==_b165, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_==_b167, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_==_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_==_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>=_b173, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>=_b175, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>=_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>=_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>>_b213, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>>_b215, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>>_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>>_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>_b169, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>_b171, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_>_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_@_b225, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_@_b227, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_@_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_@_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_^_b217, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_^_b219, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_^_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_^_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_|_b221, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_|_b223, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_|_b_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_binary_a_3_op_|_b_3, test/typing/test_python_operators.py::TestPythonOperators::test_operators_are_correct_and_complete, test/typing/test_python_operators.py::TestPythonOperators::test_type_tests_are_complete, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_+_a1, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_+_a3, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_+_a_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_+_a_3, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_-_a5, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_-_a7, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_-_a_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_-_a_3, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_~_a11, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_~_a9, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_~_a_1_5, test/typing/test_python_operators.py::TestPythonOperators::test_unary_op_~_a_3 2025-08-14T22:27:51.4158399Z 2025-08-14T22:27:55.5203833Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:27:55.5205250Z import pkg_resources 2025-08-14T22:27:55.9043781Z Running test_file_check 1/1 ... [2025-08-14 22:27:55.903849] 2025-08-14T22:27:55.9044199Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:27:55.9045703Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_file_check.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:27:55.904251] 2025-08-14T22:28:00.6285505Z 2025-08-14T22:28:00.6286246Z test_file_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_file_check_1.1_e1bcccea08b53b5a_.log 2025-08-14T22:28:00.6287275Z Running 2 items in this shard: test/test_file_check.py::TestFileCheck::test_all_python_api, test/test_file_check.py::TestFileCheck::test_not_run 2025-08-14T22:28:00.6287811Z 2025-08-14T22:28:05.0075407Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:05.0076756Z import pkg_resources 2025-08-14T22:28:05.3851784Z Running torch_np/test_dtype 1/1 ... [2025-08-14 22:28:05.384656] 2025-08-14T22:28:05.3852340Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:05.3854762Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_dtype.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:05.385098] 2025-08-14T22:28:06.9345935Z 2025-08-14T22:28:06.9347552Z test_fx_experimental 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_experimental_1.1_aa802a4a4851545d_.log 2025-08-14T22:28:06.9669048Z Running 723 items in this shard: test/test_fx_experimental.py::TestFXExperimental::test_annotate_getitem_node, test/test_fx_experimental.py::TestFXExperimental::test_annotate_returns_with_schema, test/test_fx_experimental.py::TestFXExperimental::test_aot_based_partition, test/test_fx_experimental.py::TestFXExperimental::test_call_to_assert_no_msg, test/test_fx_experimental.py::TestFXExperimental::test_call_to_assert_with_empty_msg, test/test_fx_experimental.py::TestFXExperimental::test_call_to_assert_with_msg, test/test_fx_experimental.py::TestFXExperimental::test_call_to_assert_with_multiline_message, test/test_fx_experimental.py::TestFXExperimental::test_conv_bn_fusion, test/test_fx_experimental.py::TestFXExperimental::test_conv_bn_fusion_mixed_dtype, test/test_fx_experimental.py::TestFXExperimental::test_conv_bn_fusion_not_running_state, test/test_fx_experimental.py::TestFXExperimental::test_cost_aware_partition, test/test_fx_experimental.py::TestFXExperimental::test_fetch, test/test_fx_experimental.py::TestFXExperimental::test_find_single_partition, test/test_fx_experimental.py::TestFXExperimental::test_lack_of_devices, test/test_fx_experimental.py::TestFXExperimental::test_large_node_error, test/test_fx_experimental.py::TestFXExperimental::test_merge_matmuls, test/test_fx_experimental.py::TestFXExperimental::test_meta_tracer, test/test_fx_experimental.py::TestFXExperimental::test_normalize_args, test/test_fx_experimental.py::TestFXExperimental::test_normalize_args_perserve_type, test/test_fx_experimental.py::TestFXExperimental::test_normalize_args_preserve_meta, test/test_fx_experimental.py::TestFXExperimental::test_normalize_binary_operators, test/test_fx_experimental.py::TestFXExperimental::test_normalize_modules_exhaustive, test/test_fx_experimental.py::TestFXExperimental::test_optimize_for_inference_cpu, test/test_fx_experimental.py::TestFXExperimental::test_optimize_for_inference_cpu_torchvision, test/test_fx_experimental.py::TestFXExperimental::test_partition_device_mapping, test/test_fx_experimental.py::TestFXExperimental::test_partition_latency, test/test_fx_experimental.py::TestFXExperimental::test_partition_node_manipulation, test/test_fx_experimental.py::TestFXExperimental::test_replace_target_nodes_with, test/test_fx_experimental.py::TestFXExperimental::test_saturate_host, test/test_fx_experimental.py::TestFXExperimental::test_size_based_partition, test/test_fx_experimental.py::TestFXExperimental::test_sparse_nn_partition, test/test_fx_experimental.py::TestFXExperimental::test_split_module_dead_code, test/test_fx_experimental.py::TestFXExperimental::test_split_module_default_arg, test/test_fx_experimental.py::TestFXExperimental::test_split_module_input_names, test/test_fx_experimental.py::TestFXExperimental::test_split_module_keep_original_order_and_noop_graph, test/test_fx_experimental.py::TestFXExperimental::test_split_module_kwargs_expansion, test/test_fx_experimental.py::TestFXExperimental::test_split_module_return_node, test/test_fx_experimental.py::TestFXExperimental::test_split_module_symint_dependency_handling, test/test_fx_experimental.py::TestFXExperimental::test_split_qualname_mapping, test/test_fx_experimental.py::TestFXExperimental::test_subgraph_creation, test/test_fx_experimental.py::TestFXExperimental::test_subgraph_trivial_resnet, test/test_fx_experimental.py::TestFXExperimental::test_subgraph_uniquename, test/test_fx_experimental.py::TestFXExperimental::test_to_folder, test/test_fx_experimental.py::TestFXExperimental::test_traceable_function_with_nonstandard_name, test/test_fx_experimental.py::TestFXExperimental::test_type_matches, test/test_fx_experimental.py::TestTranslationValidation::test_sat, test/test_fx_experimental.py::TestTranslationValidation::test_sat_bitwise, test/test_fx_experimental.py::TestTranslationValidation::test_sympy_to_z3, test/test_fx_experimental.py::TestTranslationValidation::test_unsat, test/test_fx_experimental.py::TestTranslationValidation::test_z3str, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_args_op_overload_cuda, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_H_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_T_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___getitem___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___radd___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rdiv___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rmatmul___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rmod___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rmul___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rpow___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive___rsub___cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__batch_norm_with_update_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__chunk_cat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__native_batch_norm_legit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__segment_reduce_lengths_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__segment_reduce_offsets_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__softmax_backward_data_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__unsafe_masked_index_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__unsafe_masked_index_put_accumulate_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive__upsample_bilinear2d_aa_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_abs_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_acos_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_acosh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_add_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addbmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addcdiv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addcmul_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addmm_decomposed_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addmv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_addr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_alias_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_all_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_allclose_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_amax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_amin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_aminmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_angle_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_any_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_arange_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_argmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_argmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_argsort_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_argwhere_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_as_strided_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_as_strided_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_as_strided_partial_views_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_as_strided_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_asin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_asinh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atan2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atan_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atanh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atleast_1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atleast_2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_atleast_3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_baddbmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bernoulli_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bfloat16_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_block_diag_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bool_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_broadcast_shapes_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_broadcast_tensors_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_broadcast_to_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_bucketize_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_byte_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cartesian_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cauchy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cdist_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cdouble_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ceil_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cfloat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_chalf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_char_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cholesky_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cholesky_inverse_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cholesky_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_chunk_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_clamp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_clamp_max_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_clamp_min_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_clone_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_column_stack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_combinations_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_complex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_conj_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_conj_physical_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_constant_pad_nd_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_contiguous_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_copysign_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_corrcoef_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cos_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cosh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_count_nonzero_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cov_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cross_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cummax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cummin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cumprod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cumsum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_cumulative_trapezoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_deg2rad_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diag_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diag_embed_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diagflat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diagonal_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diagonal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diagonal_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_diff_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_digamma_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_dist_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_div_floor_rounding_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_div_no_rounding_mode_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_div_trunc_rounding_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_dot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_double_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_dsplit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_dstack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_einsum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_empty_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_empty_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_empty_permuted_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_empty_strided_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_eq_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_equal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_erf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_erfc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_erfinv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_exp2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_exp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_expand_as_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_expand_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_expand_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_expm1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_exponential_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_eye_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_fft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_fft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_fftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_fftshift_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_hfft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_hfft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_hfftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ifft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ifft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ifftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ifftshift_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ihfft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ihfft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_ihfftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_irfft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_irfft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_irfftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_rfft2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_rfft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fft_rfftn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fill_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_flatten_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_flip_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fliplr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_flipud_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_float_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_float_power_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_floor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_floor_divide_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_fmod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_frac_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_frexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_full_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_full_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_gather_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ge_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_geometric_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_geqrf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_gradient_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_grid_sampler_2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_gt_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_half_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_hash_tensor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_heaviside_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_histc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_hsplit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_hstack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_hypot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_i0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_igamma_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_igammac_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_add_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_fill_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_put_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_reduce_amax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_reduce_amin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_reduce_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_reduce_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_index_select_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_inner_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_int_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isclose_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isfinite_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isinf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isnan_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isneginf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isposinf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_isreal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_item_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_2inputs_2outputs_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_4inputs_with_extra_args_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_binary_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_binary_return_by_ref_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_jiterator_unary_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_kron_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_kthvalue_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ldexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_le_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lerp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lgamma_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_cholesky_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_cholesky_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_cond_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_cross_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_det_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_diagonal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_eig_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_eigh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_eigvals_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_eigvalsh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_householder_product_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_inv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_inv_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_ldl_factor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_ldl_factor_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_ldl_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lstsq_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lstsq_grad_oriented_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lu_factor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lu_factor_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_lu_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_matrix_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_matrix_power_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_matrix_rank_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_matrix_rank_hermitian_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_multi_dot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_norm_subgradients_at_zero_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_pinv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_pinv_hermitian_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_pinv_singular_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_qr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_slogdet_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_solve_ex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_solve_triangular_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_svd_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_svdvals_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_tensorinv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_tensorsolve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_vander_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_vecdot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linalg_vector_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linspace_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_linspace_tensor_overload_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log10_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log1p_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log_normal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log_softmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_log_softmax_with_dtype_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logaddexp2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logaddexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logcumsumexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logdet_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logical_and_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logical_not_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logical_or_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logical_xor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logspace_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logspace_tensor_overload_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_logsumexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_long_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lt_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lu_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_lu_unpack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mH_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mT_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_amax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_amin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_argmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_argmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_cumprod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_cumsum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_fill_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_log_softmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_logaddexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_logsumexp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_median_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_normalize_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_select_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_softmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_softmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_std_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_sum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_masked_var_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_matmul_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_matrix_exp_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_max_binary_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_max_pool2d_with_indices_backward_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_max_reduction_no_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_max_reduction_with_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_maximum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_median_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_meshgrid_list_of_tensors_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_meshgrid_variadic_tensors_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_min_binary_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_min_reduction_no_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_min_reduction_with_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_minimum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mode_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_movedim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_msort_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mul_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_multinomial_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mv_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nan_to_num_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nanmean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nanmedian_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nanquantile_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nansum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_narrow_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_narrow_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_native_batch_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_native_dropout_backward_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_native_layer_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ne_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_neg_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_empty_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_empty_strided_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_full_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_ones_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_new_zeros_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nextafter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_alpha_dropout_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_avg_pool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_avg_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_avg_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_batch_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_bilinear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_binary_cross_entropy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_celu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_channel_shuffle_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv_transpose1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv_transpose2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_conv_transpose3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_cosine_embedding_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_cosine_similarity_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_cross_entropy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_ctc_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_dropout2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_dropout3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_dropout_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_elu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_embedding_bag_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_embedding_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_fractional_max_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_fractional_max_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_gaussian_nll_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_gelu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_glu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_grid_sample_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_group_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hardshrink_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hardsigmoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hardswish_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hardtanh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_hinge_embedding_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_huber_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_instance_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_area_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_bicubic_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_bilinear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_linear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_nearest_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_interpolate_trilinear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_kl_div_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_l1_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_layer_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_leaky_relu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_linear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_local_response_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_logsigmoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_margin_ranking_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_pool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_pool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_pool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool1d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool1d_grad_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool2d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool2d_grad_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool3d_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_max_unpool3d_grad_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_mish_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_mse_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_multi_head_attention_forward_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_multi_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_multilabel_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_nll_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_normalize_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_circular_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_constant_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_reflect_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_replicate_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pad_replicate_negative_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pairwise_distance_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pdist_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pixel_shuffle_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_pixel_unshuffle_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_poisson_nll_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_prelu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_relu6_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_relu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_rms_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_rrelu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_selu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_silu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_smooth_l1_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_soft_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softmin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softmin_with_dtype_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softplus_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softshrink_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_softsign_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_tanhshrink_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_threshold_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_triplet_margin_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_unfold_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_upsample_bilinear_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nn_functional_upsample_nearest_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nonzero_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_nonzero_static_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_norm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_norm_fro_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_norm_inf_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_norm_nuc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_normal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_normal_in_place_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_normal_number_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ones_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ones_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ormqr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_outer_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_pca_lowrank_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_permute_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_permute_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_pinverse_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polar_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_2_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_3_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_polygamma_polygamma_n_4_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_positive_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_pow_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_put_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_qr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_quantile_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rad2deg_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rand_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_randint_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_randint_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_randn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_randn_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_ravel_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_real_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_reciprocal_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_remainder_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_renorm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_repeat_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_repeat_interleave_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_reshape_as_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_reshape_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_resize__cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_resize_as__cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_resolve_conj_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_resolve_neg_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_roll_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rot90_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_round_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_round_decimals_0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_round_decimals_3_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_round_decimals_neg_3_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rsqrt_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_rsub_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scalar_tensor_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_add_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_amax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_amin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_prod_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_scatter_reduce_sum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_searchsorted_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_select_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_select_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sgn_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_short_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sigmoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sign_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_bartlett_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_blackman_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_cosine_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_exponential_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_gaussian_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_general_cosine_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_general_hamming_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_hamming_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_hann_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_kaiser_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signal_windows_nuttall_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_signbit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sin_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sinc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sinh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_slice_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_slice_scatter_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_softmax_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_softmax_with_dtype_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sort_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sparse_mm_reduce_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sparse_sampled_addmm_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_airy_ai_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_bessel_j0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_bessel_j1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_bessel_y0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_bessel_y1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_chebyshev_polynomial_t_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_chebyshev_polynomial_u_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_chebyshev_polynomial_v_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_chebyshev_polynomial_w_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_entr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_erfcx_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_hermite_polynomial_h_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_hermite_polynomial_he_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_i0e_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_i1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_i1e_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_laguerre_polynomial_l_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_legendre_polynomial_p_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_log_ndtr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_modified_bessel_i0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_modified_bessel_i1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_modified_bessel_k0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_modified_bessel_k1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_ndtr_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_ndtri_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_scaled_modified_bessel_k0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_scaled_modified_bessel_k1_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_spherical_bessel_j0_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_xlog1py_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_special_zeta_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_split_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_split_list_args_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_split_with_sizes_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_split_with_sizes_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sqrt_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_square_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_squeeze_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_squeeze_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_squeeze_multiple_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_stack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_std_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_std_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_std_mean_unbiased_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_std_unbiased_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_stft_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sub_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sum_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_sum_to_size_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_svd_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_svd_lowrank_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_t_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_t_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_take_along_dim_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_take_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tan_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tanh_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tensor_split_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tensordot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tile_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_to_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_to_sparse_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_topk_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_trace_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_transpose_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_transpose_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_trapezoid_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_trapz_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_triangular_solve_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_tril_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_triu_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_true_divide_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_trunc_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unbind_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unbind_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unflatten_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unfold_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unfold_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_uniform_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unique_consecutive_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unique_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unsafe_chunk_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unsafe_split_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unsqueeze_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_unsqueeze_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_var_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_var_mean_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_var_mean_unbiased_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_var_unbiased_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_vdot_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_view_as_complex_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_view_as_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_view_copy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_view_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_vsplit_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_vstack_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_where_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_xlogy_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_zero__cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_zeros_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_operator_exhaustive_zeros_like_cuda_float32, test/test_fx_experimental.py::TestNormalizeOperatorsCUDA::test_normalize_quantized_eb_cuda 2025-08-14T22:28:06.9976997Z 2025-08-14T22:28:10.0591814Z 2025-08-14T22:28:10.0592793Z torch_np/test_dtype 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_dtype_1.1_1bef8a76633557a4_.log 2025-08-14T22:28:10.0606281Z Running 44 items in this shard: test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'bool_', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'complex128', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'complex64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'float16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'float32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'float64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'int16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'int32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'int64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'int8', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'uint16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'uint32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'uint64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_'uint8', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_bool, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'bool_', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'complex128', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'complex64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'float16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'float32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'float64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'int16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'int32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'int64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'int8', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'uint16', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'uint32', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'uint64', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.'uint8', test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.bool_, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.complex128, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.complex64, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.dtype('bool'), test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.float16, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.float32, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.float64, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.int16, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.int32, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.int64, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.int8, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.uint16, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.uint32, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.uint64, test/torch_np/test_dtype.py::TestConvertDType::test_convert_np_dtypes_numpy.uint8 2025-08-14T22:28:10.0619053Z 2025-08-14T22:28:11.1586665Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:11.1588875Z import pkg_resources 2025-08-14T22:28:11.5693541Z Running higher_order_ops/test_with_effects 1/1 ... [2025-08-14 22:28:11.568863] 2025-08-14T22:28:11.5694180Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:11.5701077Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'higher_order_ops/test_with_effects.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:11.569763] 2025-08-14T22:28:14.1874934Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:14.1876258Z import pkg_resources 2025-08-14T22:28:14.5643329Z Running test_native_functions 1/1 ... [2025-08-14 22:28:14.563769] 2025-08-14T22:28:14.5643771Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:14.5645469Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_native_functions.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:14.564154] 2025-08-14T22:28:19.1388527Z 2025-08-14T22:28:19.1389528Z test_native_functions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_native_functions_1.1_5c9875f3a132f40e_.log 2025-08-14T22:28:19.1393826Z Running 11 items in this shard: test/test_native_functions.py::TestNativeFunctions::test_intlist_error_with_overload, test/test_native_functions.py::TestNativeFunctions::test_optional_filled_intlist, test/test_native_functions.py::TestNativeFunctions::test_optional_floatlist, test/test_native_functions.py::TestNativeFunctions::test_optional_floatlist_invalid, test/test_native_functions.py::TestNativeFunctions::test_optional_intlist, test/test_native_functions.py::TestNativeFunctions::test_optional_intlist_invalid, test/test_native_functions.py::TestNativeFunctions::test_string_defaults, test/test_native_functions.py::TestNativeFunctions::test_symintlist_error, test/test_native_functions.py::TestNativeFunctions::test_symintlist_error_with_overload, test/test_native_functions.py::TestNativeFunctions::test_symintlist_error_with_overload_but_is_unique, test/test_native_functions.py::TestNativeFunctions::test_vararg_symintlist_error 2025-08-14T22:28:19.1397165Z 2025-08-14T22:28:23.4050526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:23.4052014Z import pkg_resources 2025-08-14T22:28:23.7911673Z Running functorch/test_minifier 1/1 ... [2025-08-14 22:28:23.790571] 2025-08-14T22:28:23.7912153Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:23.7913324Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_minifier.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:23.790957] 2025-08-14T22:28:28.6159950Z 2025-08-14T22:28:28.6161338Z functorch/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_minifier_1.1_c710732635af4530_.log 2025-08-14T22:28:28.6163226Z Running 5 items in this shard: test/functorch/test_minifier.py::TestMinifier::test_has_add_mul, test/functorch/test_minifier.py::TestMinifier::test_has_mul_minifier, test/functorch/test_minifier.py::TestMinifier::test_input_returned, test/functorch/test_minifier.py::TestMinifier::test_module, test/functorch/test_minifier.py::TestMinifier::test_tup_use 2025-08-14T22:28:28.6164627Z 2025-08-14T22:28:28.8795424Z 2025-08-14T22:28:28.8796720Z higher_order_ops/test_with_effects 1/1 was successful, full logs can be found in artifacts with path test/test-reports/higher_order_ops.test_with_effects_1.1_216deb300ab15c33_.log 2025-08-14T22:28:28.8804535Z Running 18 items in this shard: test/higher_order_ops/test_with_effects.py::TestWithEffects::test_alias_op, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_compile_aot_eager, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_compile_aot_eager_requires_grad, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_compile_inductor, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_compile_inductor_external_op_return_none, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_effectful_custom_op_with_subclasses, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_effectful_op_in_backward, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_effects_and_aliased_outputs, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_effects_and_input_mutation_is_output, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_effects_and_input_mutation_return, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_effects_and_input_output_view_simple, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_print, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_print_with_buffer_mutations, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_print_with_input_mutations, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_register_effectful_custom_op, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_regular_effectful_op_in_forward_and_backward, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_regular_effectful_op_only_in_backward, test/higher_order_ops/test_with_effects.py::TestWithEffects::test_torchbind_custom_op 2025-08-14T22:28:28.8810807Z 2025-08-14T22:28:32.4700710Z 2025-08-14T22:28:32.4701789Z inductor/test_compiled_autograd 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_autograd_1.2_564c2bde9d62815d_.log 2025-08-14T22:28:32.4885988Z Running 423 items in this shard: test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_1_1, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_1_2, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_1_3, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_1_5_2, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_3_1, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_accumulate_grad_polyfill_case_3_2, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_anomaly_mode_already_nan, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_anomaly_mode_backward, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_anomaly_mode_grad, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_basic_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_data_dependent_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_id_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_non_traceable, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_saved_dynamic_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_saved_float_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_saved_int_is_traceable_False, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_autograd_cpp_node_saved_int_is_traceable_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_backward_hook_relative_ordering_partial, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cache_hit, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_checkpointing_sac, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_checkpointing_simple_reentrant_False, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_checkpointing_simple_reentrant_True, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_api_compile_backend_aot_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_api_compile_backend_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_api_compile_backend_inductor, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_api_optimize_backend_aot_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_disable_api_compile_backend_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compile_api_disable_api_compile_backend_inductor, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_compiled_autograd_does_not_specialize_on_bw_symints, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cpu_offloading, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cudagraphs_cpu_graph, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cudagraphs_cpu_scalar_used_in_cpp_custom_op, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cudagraphs_cpu_scalar_used_in_python_custom_op, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_cudagraphs_sdpa, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_bw_graph_break, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_compiled_fw_bw_graph_break, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_dynamically_defined_class, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_multiple_grads, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_saved_attr, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_saved_multiple_tensors, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_custom_fn_saved_tensors, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_ddp_cpp_reducer_error, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_ddp_python_reducer, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_disk_offloading, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_dynamic_shapes_annotations, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_dynamic_shapes_eager_node, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_dynamo_boxed, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_flex_attention, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_free_activation_memory_subclass, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_higher_order_gradients, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_inplace_grad_update, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_inputs_aliasing_bytecode_stack_restore, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_issue106555, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_keep_graph_usage_after_compiled, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_logging_tensor_flaky, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_optimize_assert_backend_aot_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_optimize_assert_backend_eager, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_optimize_assert_backend_inductor, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_output_nodes_all_leaves, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_reorder_multi_pre_hooks, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_reorder_multi_tensor_pre_hooks, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_reset, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_saved_tensor_unpack_hook_ordering, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_tensor_grad_hook1, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_tensor_grad_hook2, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_torch_compile_only_backward_call, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_torch_function_mode, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_trace_run_with_rng_state, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_aot_dispatcher_nodes, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_aot_dispatcher_nodes_hop, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_cpp, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_dynamic_shapes, test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_verbose_logs_snapshot, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_access_saved_tensor_twice_without_recomputation_works, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_accumulate_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_accumulate_grad_posthooks_can_observe_tensor_prehook, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_accumulate_grad_posthooks_should_not_execute, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_accumulate_grad_with_zero_numel_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_anomaly_assign_parent_cleanup, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_anomaly_detect_nan, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_anomaly_mode_no_check_nan, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_inplace_view_of_view, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_inplace_views_creation_meta, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_inplace_views_cross_dtype, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_multiple_views_python, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_simple_views_python, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_autograd_views_codegen, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_badcalls, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_copy, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_create_graph_warns, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_hook_relative_ordering, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_no_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_twice_retained_graph_with_saved_values, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_twice_with_saved_values, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_backward_with_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_calculate_shape_util, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_callback_adds_callback, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_cant_create_saved_tensors, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpoint_detects_non_determinism, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpoint_valid_reset_on_error, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_correct_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_custom_function_works, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_dataparallel, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_detached_tensor_use_reentrant_False, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_input_requires_grad_False, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_input_requires_grad_True, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_checkpointing_without_reentrant_memory_savings, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_create_graph_and_full_backward_hook_cycle, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_current_graph_task_execution_order, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_autograd_no_early_free, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_autograd_repeated_grad_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_cycle, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_error, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_exception, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_forward_mode_non_differentiable, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_forward_mode_non_tensor_before_tensor_args, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_forward_mode_wrong_formula, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_mark_dirty_not_differentiable, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_preserve_torch_function_when_return_as_is, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_saved_tensors, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_setup_context_simple, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_custom_function_vmap_defaults, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_deep_reentrant, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_dep_nograd, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_dependent_backward, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_detach_base, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_detach_then_inplace_raises_in_autograd, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_disabling_saved_tensor_hooks, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_disabling_saved_tensor_hooks_nested, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_duplicate_backward_root, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_enable_grad_decorator_no_paren, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_first_grad_fn_access_in_no_grad_mode, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_free_deep_graph_complicated, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_free_deep_graph_pyfunction, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_function, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_batched_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_empty_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_fn_badcalls, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_fn_input_metadata, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_fn_prehooks, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_fn_prehooks_multiple_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_nonleaf, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_nonleaf_register_hook, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_to_node, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_to_node_inplace, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_to_node_materialize, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_grad_unreachable_discovery, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_check_batched_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_check_forward_or_backward_only, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_complex_non_complex_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_custom_error, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_dense_and_sparse_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_forward_ad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_forward_ad_respects_requires_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_forward_ad_runs_with_no_requires_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_input_layout2, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_input_layout4, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_output_shape_or_dtype_depend_on_values, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_test_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_gradcheck_validates_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_graph_save_on_cpu, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_hook_edge_case_when_called_with_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_hook_none, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_hooks_cpp, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_indexing, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_inplace_not_requires_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_inplace_on_view_backward, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_inplace_on_view_leaf_errors, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_inplace_on_view_weak_grad_fn, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_integer_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_legacy_function_deprecation_exception, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_lobpcg, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_mark_non_differentiable, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_materialize_grads, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_multi_backward, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_multi_backward_no_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_named_tensor_for_complex_views, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_naughty_anomaly_access, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_naughty_autograd_function_stashing_ctx, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_nested_anomaly_printstack_cleanup, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_next_functions, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_no_grad_python_function, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_no_requires_grad_inplace, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_no_unnecessary_save, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_not_implemented_fwad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_pickle, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_post_accumulate_grad_hook_gets_cleaned_up, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_post_accumulate_grad_hook_returns_not_None, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_pow_zero_tensor_gradient, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_power_function, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_prehook_ordering, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler_aggregation_table, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler_function_event_avg, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler_seq_nr, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_profiler_shapes, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_record_function, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_reentrant_child_error, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_reentrant_with_callbacks_depth_0, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_reentrant_with_leaf_variable_hook, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_requires_grad_, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_retain_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_retain_grad_cycle, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_retains_grad_inplace_multiple_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_return_duplicate, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_return_duplicate_inplace, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_return_leaf, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_save_none_for_backward, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_save_on_cpu_and_checkpoint, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_save_output_nr, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_tensor_hooks_custom_function_intermediates, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_tensor_hooks_extra_enter_during_bw_no_leak, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_variable_packing_unpacking_did_not_save_original_with_hooks, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_variable_packing_unpacking_saved_original_with_default_hooks, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_saved_variable_version_counter, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_scalar_grad_mixed_device, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_select_expanded_v, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_data_tensorimpl_type, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_grad_coroutines_benign_exceptions, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_grad_enabled_wraps, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_grad_generator_functions, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_set_materialize_non_diff_grads, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_shape, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_sharded_grad, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_sparse_gather_both_scalar, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_sparse_gather_dim_neg, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_sparse_gather_ind_scalar, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_tensor_grad_warnings, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_tensor_hooks_inplace_multiple_outputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_thread_shutdown, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_too_many_grads, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_unrelated_inputs, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_unused_output, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_var_mean_differentiable, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_version_counter, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_view_func_replay_with_modified_state, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_volatile_deprecated, test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_will_engine_execute_node, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_early_stop_False, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_kwargs_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_non_tensor_inputs_and_outputs_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_reentrant_backwards_early_stop_False, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_reentrant_backwards_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_same_graph_early_stop_True, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_two_children_early_stop_False, test/inductor/test_compiled_autograd.py::TestNestedCheckpointWithCompiledAutograd::test_nested_checkpoint_two_children_early_stop_True, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_abstract_impl_on_existing_op, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_abstract_impl_on_existing_op_with_CompositeExplicitAutograd, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_dict_grad_for_nontensor, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_impl_on_existing_op_incorrect_schema_mutable, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_impl_on_existing_op_incorrect_schema_no_output, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_impl_on_existing_op_with_key_key_AutogradCUDA, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_output_differentiability_tensorlist, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_backward_tensorlist_input_requires_list_grads_with_same_numel, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_basic_make_fx, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_data_dependent_basic, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_data_dependent_nms_dynamic_compile, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_defined_in_python, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_duplicate_impl, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_abstract_overload, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_device_cpu, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_invalid_devices, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_multiple, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_on_existing_op_with_cpu_registration_key_CPU, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_on_existing_op_with_cpu_registration_key_CompositeImplicitAutograd, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_impl_separate, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_infer_schema_supported, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_infer_schema_unsupported, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_invalid_qualname, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_invalid_schemas, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_is_functional_schema, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_is_tensorlist_like_type, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_legacy_define, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_legacy_impl, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_meta_for_data_dependent_shape_operation, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_name_must_match, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_new_data_dependent_symint, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_override_impl, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_override_meta, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_private_ctor, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_supported_param_types, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_symints, test/inductor/test_compiled_autograd.py::TestCustomOpWithCompiledAutograd::test_unsupported_schemas, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_allow_python_side_effects_utility, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_constants, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_input_num, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_numpy_number, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_tracked, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_capture_untracked_global_nested, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_branches_no_arguments, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_free_variable_in_both_branches, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_graph_break_in_one_branch, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_pytree_operands, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_side_effect_in_one_branches, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_source_fn_stack, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_cond_with_constant_pred, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_fallback_on_graph_break_simple, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_freevars_as_inputs_to_wrap, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_grad_source_fn_stack, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_hints_wrapper_no_hints, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_hopify_generic_wrap, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_internal_nonlocal, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_lift_tensors_with_compound_expressions, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_kwargs, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_lowers_to_graph, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_multi_return, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_pytree_return, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_source_fn_stack, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_map_subgraph_name_is_valid, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_nested_tuple_output, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_nested_wrap, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_no_freevars, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_output_with_dict, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_register_subclass, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_return_captured_var, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_return_captured_var_used_multiple_times, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_return_captured_vars, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_del_existing_attr_global_obj, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_del_existing_attr_nonlocal_obj, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_local_list_append_no_graph_break, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_global_list, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_global_num, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_global_num_builtin, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_global_tensor, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_nonlocal_num, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_nonlocal_num_builtin, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_mutate_nonlocal_tensor_builtin, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_nested_nonlocal_list_append_graph_break, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_nonlocal_list_append_graph_break, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_set_existing_attr_global_module, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_set_existing_attr_global_obj, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_set_existing_attr_nonlocal_module, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_side_effect_set_new_attr_global_module, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_symint_in_slice, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_unbacked_symbol_closure, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_vmap_multiply_scalar, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_vmap_source_fn_stack, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_allow_local_assign_in_body_fn, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_kwarg, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_kwarg_default_else_branch, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_kwarg_only, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_kwarg_recompile, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_pytree_kwargs, test/inductor/test_compiled_autograd.py::HigherOrderOpTestsWithCompiledAutograd::test_wrap_source_fn_stack, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_functional_call_sequential_params_and_buffers, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_call_compiled_backward_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_call_torch_compile_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_fn_with_kwargs, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_freevar_python_scalar, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_freevar_tensor, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_pytree, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_recompile, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_with_graph_break, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_grad_with_side_effect, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_hessian, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_hessian_argnums, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jacfwd, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jacfwd_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jacrev_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jacrev_two_tensors_argnums, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_call_torch_compile_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_freevar_tensor, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_simple, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_jvp_two_tensors_has_aux, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vjp_call_compiled_backward_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vjp_multiple_outputs, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vjp_multiple_outputs_python_struct, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_call_torch_compile_fn, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_free_const, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_multiple_invocation_in_dims, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_multiple_invocation_out_dims, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_multiple_outputs_diff_dims, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_over_vmap_captured, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_pytree_inputs, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_recompile, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_recompile_different_config, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_recompile_same_config, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_side_effects, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_side_effects_append_input, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_two_inputs, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_two_inputs_tuple_in_dims, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_with_conditional_graph_break, test/inductor/test_compiled_autograd.py::FuncTorchHigherOrderOpTestsWithCompiledAutograd::test_vmap_with_graph_break, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_cond_with_invalid_kwargs, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_dropout_inductor, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_flop_counter_for_cond, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_flop_counter_for_cond_unbalanced_branches, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_function, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_module, test/inductor/test_compiled_autograd.py::ActivationCheckpointingTestsWithCompiledAutograd::test_non_aliasing_util, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_device_mesh_compile, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_basic_export, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_constructor_w_dynamo_disable, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_constructor_w_graph_break, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_different_gradient_placement, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dont_recompile_on_same_placement_devicemesh, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dynamic, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dynamic_slice, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_dynamo_device_mesh_attrs, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_partial_placement_graph_output, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dtensor_partial_placement_redistribute_unbalanced_correct_strides, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor_from_local_dynamic_shapes, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor_from_local_redistribute, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor_from_local_redistribute_async, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_dtensor_recompile, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_to_local_kwargs, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_dynamo_to_local_kwargs_forward_hook, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_fakify_dtensor, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_graph_input_is_async, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_placement_compile, test/inductor/test_compiled_autograd.py::TestDTensorCompileWithCompiledAutograd::test_unwrap_async_collective_tensor_tangent, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_cond_simple_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_invoke_quant_packed_simple_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_invoke_subgraph_simple_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_map_nested_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_map_simple_cuda_float32, test/inductor/test_compiled_autograd.py::TestCompiledAutogradOpInfoCUDA::test_hops_in_bwd_while_loop_simple_cuda_float32 2025-08-14T22:28:32.5062254Z 2025-08-14T22:28:32.7823936Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:32.7825271Z import pkg_resources 2025-08-14T22:28:33.1560941Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:33.1562285Z import pkg_resources 2025-08-14T22:28:33.1713412Z Running test_flop_counter 1/1 ... [2025-08-14 22:28:33.170708] 2025-08-14T22:28:33.1713937Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:33.1715598Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_flop_counter.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:33.171095] 2025-08-14T22:28:33.5399255Z Running torch_np/test_nep50_examples 1/1 ... [2025-08-14 22:28:33.539355] 2025-08-14T22:28:33.5399700Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:33.5401063Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_nep50_examples.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:33.539771] 2025-08-14T22:28:36.7034421Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:36.7035758Z import pkg_resources 2025-08-14T22:28:37.0958134Z Running higher_order_ops/test_invoke_quant 1/1 ... [2025-08-14 22:28:37.095198] 2025-08-14T22:28:37.0958617Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:37.0959898Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'higher_order_ops/test_invoke_quant.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:37.095653] 2025-08-14T22:28:37.9539100Z 2025-08-14T22:28:37.9540375Z test_flop_counter 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_flop_counter_1.1_25bd73df2d211d23_.log 2025-08-14T22:28:37.9549724Z Running 22 items in this shard: test/test_flop_counter.py::TestFlopCounter::test_addmm_out, test/test_flop_counter.py::TestFlopCounter::test_autograd_op, test/test_flop_counter.py::TestFlopCounter::test_backward, test/test_flop_counter.py::TestFlopCounter::test_backward_reset, test/test_flop_counter.py::TestFlopCounter::test_conv_backwards_as_decomposition, test/test_flop_counter.py::TestFlopCounter::test_conv_transpose_loop, test/test_flop_counter.py::TestFlopCounter::test_convs, test/test_flop_counter.py::TestFlopCounter::test_custom, test/test_flop_counter.py::TestFlopCounter::test_custom_op, test/test_flop_counter.py::TestFlopCounter::test_flop_counter_variety, test/test_flop_counter.py::TestFlopCounter::test_hook_registration, test/test_flop_counter.py::TestFlopCounter::test_inference_mode, test/test_flop_counter.py::TestFlopCounter::test_module, test/test_flop_counter.py::TestFlopCounter::test_nested_attention_fake_tensors, test/test_flop_counter.py::TestFlopCounter::test_noop, test/test_flop_counter.py::TestFlopCounter::test_op, test/test_flop_counter.py::TestFlopCounter::test_pytrees, test/test_flop_counter.py::TestFlopCounter::test_scaled_mm, test/test_flop_counter.py::TestFlopCounter::test_sdpa, test/test_flop_counter.py::TestFlopCounter::test_sdpa_nested_tensor, test/test_flop_counter.py::TestFlopCounter::test_torchscript, test/test_flop_counter.py::TestFlopCounter::test_warning 2025-08-14T22:28:37.9558405Z 2025-08-14T22:28:39.8684473Z 2025-08-14T22:28:39.8685601Z torch_np/test_nep50_examples 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_nep50_examples_1.1_572621bd9f55a166_.log 2025-08-14T22:28:39.9300641Z Running 1573 items in this shard: test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_3j + array(3, complex64), test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_True + uint8(2), test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array(1_0, float32) + 1e-14 == 1_0, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([0_1], float32) == float64(0_1), test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([100], uint8) + 200, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1], uint8) + 1, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1], uint8) + 200, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1], uint8) + 300, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1], uint8) + array(1, int64), test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1], uint8) + int64(1), test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1_0], float32) + 1e-14 == 1_0, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1_], float32) + 3, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1_], float32) + array(1_, float64), test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1_], float32) + float64(1_), test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_array([1_], float32) + int64(3), test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_bool_(True) + 1, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_float32(1) + 1j, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_float32(1) + 3e100, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_float32(5) + 5j, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_int16(2) + 2, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_int16(4) + 4j, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_int32(1) + 5j, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_uint8(1) + 2, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_uint8(1) + 300, test/torch_np/test_nep50_examples.py::TestNEP50Table::test_nep50_exceptions_example_uint8(100) + 200, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_add_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_arctan2_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_and_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_or_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_bitwise_xor_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_copysign_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divide_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_divmod_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_equal_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_float_power_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_floor_divide_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmax_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmin_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_fmod_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_gcd_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_equal_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_greater_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_heaviside_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_hypot_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_lcm_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_ldexp_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_left_shift_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_equal_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_less_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp2_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logaddexp_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_and_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_or_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_logical_xor_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_matmul_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_maximum_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_minimum_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_mod_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_modf_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_multiply_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_nextafter_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_not_equal_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_power_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_remainder_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_right_shift_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_subtract_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar27_array27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar28_array28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar29_array29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar30_array30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar31_array31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar32_array32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar33_array33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar34_array34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar35_array35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_1_array9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_2_0_array26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_compare_ufuncs_name_true_divide_scalar_True_array8, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar27_array27_dtype27, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar28_array28_dtype28, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar29_array29_dtype29, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar30_array30_dtype30, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar31_array31_dtype31, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar32_array32_dtype32, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar33_array33_dtype33, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar34_array34_dtype34, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar35_array35_dtype35, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array10_dtype10, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array11_dtype11, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array12_dtype12, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array13_dtype13, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array14_dtype14, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array15_dtype15, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array16_dtype16, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array17_dtype17, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_1_array9_dtype9, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array18_dtype18, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array19_dtype19, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array20_dtype20, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array21_dtype21, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array22_dtype22, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array23_dtype23, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array24_dtype24, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array25_dtype25, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_2_0_array26_dtype26, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array0_dtype0, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array1_dtype1, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array2_dtype2, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array3_dtype3, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array4_dtype4, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array5_dtype5, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array6_dtype6, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array7_dtype7, test/torch_np/test_nep50_examples.py::TestCompareToNumpy::test_direct_compare_scalar_True_array8_dtype8 2025-08-14T22:28:39.9888623Z 2025-08-14T22:28:42.1371250Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:42.1372582Z import pkg_resources 2025-08-14T22:28:42.5141501Z Running test_per_overload_api 1/1 ... [2025-08-14 22:28:42.513593] 2025-08-14T22:28:42.5141950Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:42.5144208Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_per_overload_api.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:42.514004] 2025-08-14T22:28:44.0438398Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:44.0439719Z import pkg_resources 2025-08-14T22:28:44.4253849Z Running test_tensorexpr_pybind 1/1 ... [2025-08-14 22:28:44.424864] 2025-08-14T22:28:44.4254285Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:44.4254791Z 2025-08-14T22:28:44.4255455Z higher_order_ops/test_invoke_quant 1/1 was successful, full logs can be found in artifacts with path test/test-reports/higher_order_ops.test_invoke_quant_1.1_c2763aa20ec90f73_.log 2025-08-14T22:28:44.4257063Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_tensorexpr_pybind.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:44.425250] 2025-08-14T22:28:44.4262703Z Running 14 items in this shard: test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantEager::test_construct_inline, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantEager::test_inline, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantEager::test_multiple, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantEager::test_simple, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantAotEager::test_construct_inline, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantAotEager::test_inline, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantAotEager::test_multiple, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantAotEager::test_simple, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantInductor::test_construct_inline, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantInductor::test_inline, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantInductor::test_multiple, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantInductor::test_pattern_matching, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantInductor::test_prologue, test/higher_order_ops/test_invoke_quant.py::TestInvokeQuantInductor::test_simple 2025-08-14T22:28:44.4267774Z 2025-08-14T22:28:47.1380612Z 2025-08-14T22:28:47.1382186Z test_per_overload_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_per_overload_api_1.1_5de49593a8d9d53b_.log 2025-08-14T22:28:47.1384402Z Running 3 items in this shard: test/test_per_overload_api.py::TestPerOverloadAPI::test_basics_opoverload, test/test_per_overload_api.py::TestPerOverloadAPI::test_basics_opoverloadpacket, test/test_per_overload_api.py::TestPerOverloadAPI::test_decompose 2025-08-14T22:28:47.1385392Z 2025-08-14T22:28:48.6203442Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:48.6205293Z import pkg_resources 2025-08-14T22:28:48.9998543Z 2025-08-14T22:28:48.9999978Z test_tensorexpr_pybind 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_tensorexpr_pybind_1.1_b5ffeda4e3be9d9d_.log 2025-08-14T22:28:49.0008960Z Running 17 items in this shard: test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_alloc_in_loop, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_call_raw, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_dtype_error, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_dynamic_shape, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_dynamic_shape_2d, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_external_calls, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_shape_prop, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_shape_prop_module, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_with_custom_lowering, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_with_expand, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_with_permute, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_with_scalar_inputs, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_with_t, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_with_tensor_inputs, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_kernel_with_transpose, test/test_tensorexpr_pybind.py::TestTensorExprPyBind::test_simple_sum, test/test_tensorexpr_pybind.py::TestExprHandlePyBind::test_unary_ops 2025-08-14T22:28:49.0017852Z 2025-08-14T22:28:49.0137815Z Running test_pytree 1/1 ... [2025-08-14 22:28:49.013197] 2025-08-14T22:28:49.0138382Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:49.0139824Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_pytree.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:49.013607] 2025-08-14T22:28:51.2790200Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:51.2792416Z import pkg_resources 2025-08-14T22:28:51.6545405Z Running test_fx_passes 1/1 ... [2025-08-14 22:28:51.653977] 2025-08-14T22:28:51.6545831Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:51.6547661Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx_passes.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:51.654399] 2025-08-14T22:28:53.1376639Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:53.1378240Z import pkg_resources 2025-08-14T22:28:53.5318490Z Running test_module_tracker 1/1 ... [2025-08-14 22:28:53.531392] 2025-08-14T22:28:53.5319070Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:53.5322218Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_module_tracker.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:53.531808] 2025-08-14T22:28:53.7381028Z 2025-08-14T22:28:53.7381791Z test_pytree 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_pytree_1.1_b9f85386d888d755_.log 2025-08-14T22:28:53.7406156Z Running 98 items in this shard: test/test_pytree.py::TestGenericPytree::test_aligned_public_apis, test/test_pytree.py::TestGenericPytree::test_broadcast_to_and_flatten_cxx, test/test_pytree.py::TestGenericPytree::test_broadcast_to_and_flatten_py, test/test_pytree.py::TestGenericPytree::test_enum_treespec_roundtrip_cxx, test/test_pytree.py::TestGenericPytree::test_enum_treespec_roundtrip_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_defaultdict_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_defaultdict_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_deque_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_deque_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_dict_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_dict_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_leaf_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_leaf_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_list_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_list_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_namedtuple_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_namedtuple_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_nested_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_nested_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_ordereddict_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_ordereddict_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_return_types_max_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_return_types_max_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_return_types_min_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_return_types_min_py, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_tuple_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_tuple_py, test/test_pytree.py::TestGenericPytree::test_flatten_with_is_leaf_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_with_is_leaf_py, test/test_pytree.py::TestGenericPytree::test_is_namedtuple_cxx, test/test_pytree.py::TestGenericPytree::test_is_namedtuple_py, test/test_pytree.py::TestGenericPytree::test_is_structseq_cxx, test/test_pytree.py::TestGenericPytree::test_is_structseq_py, test/test_pytree.py::TestGenericPytree::test_pytree_serialize_bad_input_cxx, test/test_pytree.py::TestGenericPytree::test_pytree_serialize_bad_input_py, test/test_pytree.py::TestGenericPytree::test_register_pytree_node_cxx, test/test_pytree.py::TestGenericPytree::test_register_pytree_node_py, test/test_pytree.py::TestGenericPytree::test_tree_all_any_cxx, test/test_pytree.py::TestGenericPytree::test_tree_all_any_py, test/test_pytree.py::TestGenericPytree::test_tree_map_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_multi_inputs_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_multi_inputs_py, test/test_pytree.py::TestGenericPytree::test_tree_map_only_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_only_predicate_fn_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_only_predicate_fn_py, test/test_pytree.py::TestGenericPytree::test_tree_map_only_py, test/test_pytree.py::TestGenericPytree::test_tree_map_py, test/test_pytree.py::TestPythonPytree::test_constant, test/test_pytree.py::TestPythonPytree::test_constant_default_eq_error, test/test_pytree.py::TestPythonPytree::test_constant_default_hash_error, test/test_pytree.py::TestPythonPytree::test_dataclass, test/test_pytree.py::TestPythonPytree::test_deprecated_register_pytree_node, test/test_pytree.py::TestPythonPytree::test_flatten_flatten_with_key_consistency, test/test_pytree.py::TestPythonPytree::test_import_pytree_doesnt_import_optree, test/test_pytree.py::TestPythonPytree::test_key_access, test/test_pytree.py::TestPythonPytree::test_key_str, test/test_pytree.py::TestPythonPytree::test_pytree_context_serialize_bad, test/test_pytree.py::TestPythonPytree::test_pytree_custom_type_serialize, test/test_pytree.py::TestPythonPytree::test_pytree_custom_type_serialize_bad, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_bad_protocol, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_defaultdict_enum, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_enum, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_namedtuple, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_namedtuple_bad, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_register_bad, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec0, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec1, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec2, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec3, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec4, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec5, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec6, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec7, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec8, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec9, test/test_pytree.py::TestPythonPytree::test_register_dataclass_class, test/test_pytree.py::TestPythonPytree::test_saved_serialized, test/test_pytree.py::TestPythonPytree::test_tree_flatten_with_path_is_leaf, test/test_pytree.py::TestPythonPytree::test_tree_flatten_with_path_roundtrip, test/test_pytree.py::TestPythonPytree::test_tree_leaves_with_path, test/test_pytree.py::TestPythonPytree::test_tree_map_with_path, test/test_pytree.py::TestPythonPytree::test_tree_map_with_path_multiple_trees, test/test_pytree.py::TestPythonPytree::test_treespec_equality, test/test_pytree.py::TestPythonPytree::test_treespec_repr, test/test_pytree.py::TestCxxPytree::test_pytree_custom_type_serialize, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_namedtuple, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec0, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec1, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec2, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec3, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec4, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec5, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec6, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec7, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec8, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec9, test/test_pytree.py::TestCxxPytree::test_treespec_equality, test/test_pytree.py::TestCxxPytree::test_treespec_repr 2025-08-14T22:28:53.7429683Z 2025-08-14T22:28:56.3290919Z 2025-08-14T22:28:56.3291873Z test_fx_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_passes_1.1_c6b0378745071a6a_.log 2025-08-14T22:28:56.3315260Z Running 53 items in this shard: test/test_fx_passes.py::TestFXGraphPasses::test_fuser_pass_deep_model, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition0, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition1, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition10, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition11, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition2, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition3, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition4, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition5, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition6, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition7, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition8, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_partition9, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_xfail_partition0, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_xfail_partition1, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_xfail_partition2, test/test_fx_passes.py::TestFXGraphPasses::test_fuser_util_xfail_partition3, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn0_expected_partition0_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn10_expected_partition10_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn11_expected_partition11_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn12_expected_partition12_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn13_expected_partition13_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn14_expected_partition14_bookend_non_compute_pass_True, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn15_expected_partition15_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn16_expected_partition16_bookend_non_compute_pass_True, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn17_expected_partition17_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn18_expected_partition18_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn1_expected_partition1_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn2_expected_partition2_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn3_expected_partition3_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn4_expected_partition4_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn5_expected_partition5_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn6_expected_partition6_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn7_expected_partition7_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn8_expected_partition8_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_fn9_expected_partition9_bookend_non_compute_pass_False, test/test_fx_passes.py::TestFXGraphPasses::test_partitioner_independent_output_fn0_expected_partition0, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model0, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model1, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model10, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model11, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model12, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model13, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model14, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model15, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model2, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model3, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model4, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model5, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model6, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model7, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model8, test/test_fx_passes.py::TestFXMatcherUtils::test_subgraph_matcher_test_model9 2025-08-14T22:28:56.3344808Z 2025-08-14T22:28:57.8188385Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:28:57.8190177Z import pkg_resources 2025-08-14T22:28:58.1563565Z 2025-08-14T22:28:58.1564310Z test_module_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_module_tracker_1.1_b53cd071b9cf6b23_.log 2025-08-14T22:28:58.1565835Z Running 3 items in this shard: test/test_module_tracker.py::TestModuleTracker::test_bw_detection, test/test_module_tracker.py::TestModuleTracker::test_confused_hierarchy, test/test_module_tracker.py::TestModuleTracker::test_module_hierarchy 2025-08-14T22:28:58.1566741Z 2025-08-14T22:28:58.2169398Z Running test_typing 1/1 ... [2025-08-14 22:28:58.216444] 2025-08-14T22:28:58.2169769Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:28:58.2172099Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_typing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:28:58.216853] 2025-08-14T22:29:00.4177819Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:00.4179226Z import pkg_resources 2025-08-14T22:29:00.8172337Z Running test_openmp 1/1 ... [2025-08-14 22:29:00.816746] 2025-08-14T22:29:00.8172725Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:00.8174336Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_openmp.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:00.817144] 2025-08-14T22:29:02.2679589Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:02.2681825Z import pkg_resources 2025-08-14T22:29:02.6420687Z Running torch_np/numpy_tests/core/test_scalarinherit 1/1 ... [2025-08-14 22:29:02.641572] 2025-08-14T22:29:02.6421217Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:02.6424827Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_scalarinherit.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:02.641954] 2025-08-14T22:29:05.3911225Z 2025-08-14T22:29:05.3912097Z test_openmp 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_openmp_1.1_4ed83f0c682194e2_.log 2025-08-14T22:29:05.3913258Z Running 2 items in this shard: test/test_openmp.py::TestOpenMP_ParallelFor::test_n_threads, test/test_openmp.py::TestOpenMP_ParallelFor::test_one_thread 2025-08-14T22:29:05.3913834Z 2025-08-14T22:29:07.0154012Z 2025-08-14T22:29:07.0155959Z torch_np/numpy_tests/core/test_scalarinherit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_scalarinherit_1.1_abff485e32153282_.log 2025-08-14T22:29:07.0158469Z Running 3 items in this shard: test/torch_np/numpy_tests/core/test_scalarinherit.py::TestInherit::test_gh_15395, test/torch_np/numpy_tests/core/test_scalarinherit.py::TestInherit::test_init, test/torch_np/numpy_tests/core/test_scalarinherit.py::TestInherit::test_init2 2025-08-14T22:29:07.0159483Z 2025-08-14T22:29:09.4849424Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:09.4850798Z import pkg_resources 2025-08-14T22:29:09.8604996Z Running functorch/test_ac_knapsack 1/1 ... [2025-08-14 22:29:09.860012] 2025-08-14T22:29:09.8605475Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:09.8608001Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ac_knapsack.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:09.860418] 2025-08-14T22:29:11.1541272Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:11.1542596Z import pkg_resources 2025-08-14T22:29:11.5352080Z Running backends/xeon/test_launch 1/1 ... [2025-08-14 22:29:11.534696] 2025-08-14T22:29:11.5352518Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:11.5356775Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'backends/xeon/test_launch.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:11.535137] 2025-08-14T22:29:14.4353636Z 2025-08-14T22:29:14.4354723Z functorch/test_ac_knapsack 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ac_knapsack_1.1_f5f4022801113cf5_.log 2025-08-14T22:29:14.4361008Z Running 15 items in this shard: test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_full_joint_nx_graph, test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_get_knapsack_memory_input, test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_get_knapsack_runtime_input, test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_get_non_ac_peak_memory, test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_get_theoretical_max_runtime, test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_inialize_from_graph, test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_recomputable_node_only_graph, test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_recomputable_node_only_graph_with_larger_graph_context, test/functorch/test_ac_knapsack.py::TestGraphInfoProvider::test_simplified_fx_joint_graph, test/functorch/test_ac_knapsack.py::TestKnapsackEvaluator::test_evaluate_distribution_of_results_for_knapsack_algo, test/functorch/test_ac_knapsack.py::TestKnapsackEvaluator::test_evaluate_knapsack_output_accounting_for_backward_pass, test/functorch/test_ac_knapsack.py::TestKnapsackEvaluator::test_evaluate_knapsack_output_not_accounting_for_backward_pass, test/functorch/test_ac_knapsack.py::TestKnapsackEvaluator::test_evaluate_knapsack_output_with_wrong_sized_values, test/functorch/test_ac_knapsack.py::TestKnapsackEvaluator::test_get_backward_memory_from_topologically_sorted_graph, test/functorch/test_ac_knapsack.py::TestKnapsackEvaluator::test_get_knee_point_memory_budget 2025-08-14T22:29:14.4367212Z 2025-08-14T22:29:17.8629775Z 2025-08-14T22:29:17.8631113Z backends/xeon/test_launch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/backends.xeon.test_launch_1.1_c9f7df9da8f60944_.log 2025-08-14T22:29:17.8632486Z Running 2 items in this shard: test/backends/xeon/test_launch.py::TestTorchrun::test_cpu_info, test/backends/xeon/test_launch.py::TestTorchrun::test_multi_threads 2025-08-14T22:29:17.8633273Z 2025-08-14T22:29:18.5226529Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:18.5227869Z import pkg_resources 2025-08-14T22:29:18.9032362Z Running test_nestedtensor 1/1 ... [2025-08-14 22:29:18.902534] 2025-08-14T22:29:18.9032937Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:18.9034729Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:18.902962] 2025-08-14T22:29:21.9435055Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:21.9436371Z import pkg_resources 2025-08-14T22:29:22.3302728Z Running test_namedtensor 1/1 ... [2025-08-14 22:29:22.329819] 2025-08-14T22:29:22.3303238Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:22.3305965Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_namedtensor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:22.330218] 2025-08-14T22:29:27.0542373Z 2025-08-14T22:29:27.0543277Z test_namedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_namedtensor_1.1_36a38d250cecc506_.log 2025-08-14T22:29:27.0565418Z Running 88 items in this shard: test/test_namedtensor.py::TestNamedTensor::test_aaa_must_run_first_check_experimental_warning, test/test_namedtensor.py::TestNamedTensor::test_addcmul_addcdiv, test/test_namedtensor.py::TestNamedTensor::test_addmm, test/test_namedtensor.py::TestNamedTensor::test_addmv, test/test_namedtensor.py::TestNamedTensor::test_align_as, test/test_namedtensor.py::TestNamedTensor::test_align_tensors, test/test_namedtensor.py::TestNamedTensor::test_align_tensors_two_inputs, test/test_namedtensor.py::TestNamedTensor::test_align_to, test/test_namedtensor.py::TestNamedTensor::test_align_to_ellipsis, test/test_namedtensor.py::TestNamedTensor::test_any_all, test/test_namedtensor.py::TestNamedTensor::test_as_strided, test/test_namedtensor.py::TestNamedTensor::test_as_strided_cuda, test/test_namedtensor.py::TestNamedTensor::test_autograd_ignores_names, test/test_namedtensor.py::TestNamedTensor::test_autograd_smoke, test/test_namedtensor.py::TestNamedTensor::test_autograd_warns_named_grad, test/test_namedtensor.py::TestNamedTensor::test_bernoulli, test/test_namedtensor.py::TestNamedTensor::test_big_tensor_repr_has_names, test/test_namedtensor.py::TestNamedTensor::test_binary_ops, test/test_namedtensor.py::TestNamedTensor::test_bitwise_not, test/test_namedtensor.py::TestNamedTensor::test_bmm, test/test_namedtensor.py::TestNamedTensor::test_cat, test/test_namedtensor.py::TestNamedTensor::test_cdist, test/test_namedtensor.py::TestNamedTensor::test_comparison_ops, test/test_namedtensor.py::TestNamedTensor::test_copy_transpose, test/test_namedtensor.py::TestNamedTensor::test_cummax_cummin, test/test_namedtensor.py::TestNamedTensor::test_detach, test/test_namedtensor.py::TestNamedTensor::test_diagonal, test/test_namedtensor.py::TestNamedTensor::test_dot, test/test_namedtensor.py::TestNamedTensor::test_equal, test/test_namedtensor.py::TestNamedTensor::test_expand, test/test_namedtensor.py::TestNamedTensor::test_factory_coverage, test/test_namedtensor.py::TestNamedTensor::test_factory_edge_cases, test/test_namedtensor.py::TestNamedTensor::test_flatten, test/test_namedtensor.py::TestNamedTensor::test_flatten_index_error, test/test_namedtensor.py::TestNamedTensor::test_flatten_nodims, test/test_namedtensor.py::TestNamedTensor::test_has_names, test/test_namedtensor.py::TestNamedTensor::test_index_fill, test/test_namedtensor.py::TestNamedTensor::test_info_smoke, test/test_namedtensor.py::TestNamedTensor::test_logcumsumexp, test/test_namedtensor.py::TestNamedTensor::test_logical_not, test/test_namedtensor.py::TestNamedTensor::test_logical_ops, test/test_namedtensor.py::TestNamedTensor::test_masked_fill, test/test_namedtensor.py::TestNamedTensor::test_masked_select, test/test_namedtensor.py::TestNamedTensor::test_matmul, test/test_namedtensor.py::TestNamedTensor::test_max_pooling, test/test_namedtensor.py::TestNamedTensor::test_max_pooling_without_names_does_not_warn, test/test_namedtensor.py::TestNamedTensor::test_mm, test/test_namedtensor.py::TestNamedTensor::test_mv, test/test_namedtensor.py::TestNamedTensor::test_no_jit_script_support, test/test_namedtensor.py::TestNamedTensor::test_no_jit_tracer_support, test/test_namedtensor.py::TestNamedTensor::test_no_multiprocessing_support, test/test_namedtensor.py::TestNamedTensor::test_no_pickle_support, test/test_namedtensor.py::TestNamedTensor::test_no_save_support, test/test_namedtensor.py::TestNamedTensor::test_noncontig_contiguous, test/test_namedtensor.py::TestNamedTensor::test_none_names_refcount, test/test_namedtensor.py::TestNamedTensor::test_nyi_dimname_overload_msg, test/test_namedtensor.py::TestNamedTensor::test_out_fn_semantics, test/test_namedtensor.py::TestNamedTensor::test_pow_special, test/test_namedtensor.py::TestNamedTensor::test_py3_ellipsis, test/test_namedtensor.py::TestNamedTensor::test_reduction_fns, test/test_namedtensor.py::TestNamedTensor::test_refine_names, test/test_namedtensor.py::TestNamedTensor::test_rename, test/test_namedtensor.py::TestNamedTensor::test_rename_, test/test_namedtensor.py::TestNamedTensor::test_rename_globber, test/test_namedtensor.py::TestNamedTensor::test_rename_rename_map, test/test_namedtensor.py::TestNamedTensor::test_repr, test/test_namedtensor.py::TestNamedTensor::test_resize, test/test_namedtensor.py::TestNamedTensor::test_select, test/test_namedtensor.py::TestNamedTensor::test_select_cuda, test/test_namedtensor.py::TestNamedTensor::test_set_names_property, test/test_namedtensor.py::TestNamedTensor::test_size, test/test_namedtensor.py::TestNamedTensor::test_split_fns_propagates_names, test/test_namedtensor.py::TestNamedTensor::test_squeeze, test/test_namedtensor.py::TestNamedTensor::test_stride, test/test_namedtensor.py::TestNamedTensor::test_support_device_named_grad, test/test_namedtensor.py::TestNamedTensor::test_tensor_from_lists, test/test_namedtensor.py::TestNamedTensor::test_tensor_from_named_tensor, test/test_namedtensor.py::TestNamedTensor::test_tensor_from_numpy, test/test_namedtensor.py::TestNamedTensor::test_tensor_from_tensor, test/test_namedtensor.py::TestNamedTensor::test_tensor_grad_is_unnamed, test/test_namedtensor.py::TestNamedTensor::test_transpose_variants, test/test_namedtensor.py::TestNamedTensor::test_trivial, test/test_namedtensor.py::TestNamedTensor::test_unary_propagate_names_fns, test/test_namedtensor.py::TestNamedTensor::test_unflatten, test/test_namedtensor.py::TestNamedTensor::test_unsupported_op_error_msg, test/test_namedtensor.py::TestNamedTensor::test_using_seen_interned_string_doesnt_bump_refcount, test/test_namedtensor.py::TestNamedTensor::test_using_unseen_interned_string_bumps_refcount_permanently, test/test_namedtensor.py::TestNamedTensor::test_using_unseen_uninterned_string_refcounts 2025-08-14T22:29:27.0585406Z 2025-08-14T22:29:27.9360267Z 2025-08-14T22:29:27.9361276Z test_nestedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_1.1_222557c6a035edc5_.log 2025-08-14T22:29:28.0038044Z Running 1590 items in this shard: test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_cat, test/test_nestedtensor.py::TestNestedTensor::test_copy_, test/test_nestedtensor.py::TestNestedTensor::test_default_nested_tensor, test/test_nestedtensor.py::TestNestedTensor::test_dim, test/test_nestedtensor.py::TestNestedTensor::test_fill_, test/test_nestedtensor.py::TestNestedTensor::test_is_contiguous, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_ones_like, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_randn_like, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_zeros_like, test/test_nestedtensor.py::TestNestedTensor::test_nested_namespace, test/test_nestedtensor.py::TestNestedTensor::test_nested_tensor, test/test_nestedtensor.py::TestNestedTensor::test_nested_tensor_matching_dim, test/test_nestedtensor.py::TestNestedTensor::test_nested_view_from_buffer_overflow_errors, test/test_nestedtensor.py::TestNestedTensor::test_numel, test/test_nestedtensor.py::TestNestedTensor::test_repr_string, test/test_nestedtensor.py::TestNestedTensor::test_size, test/test_nestedtensor.py::TestNestedTensor::test_size_dim, test/test_nestedtensor.py::TestNestedTensor::test_stride, test/test_nestedtensor.py::TestNestedTensor::test_to, test/test_nestedtensor.py::TestNestedTensor::test_to_padded_tensor_on_empty_tensor, test/test_nestedtensor.py::TestNestedTensor::test_unbind_0, test/test_nestedtensor.py::TestNestedTensor::test_unbind_1, test/test_nestedtensor.py::TestNestedTensor::test_unbind_3, test/test_nestedtensor.py::TestNestedTensor::test_unbind_4, test/test_nestedtensor.py::TestNestedTensor::test_unbind_dim, test/test_nestedtensor.py::TestNestedTensor::test_zero_, test/test_nestedtensor.py::TestNestedInt::test_comparisons, test/test_nestedtensor.py::TestNestedInt::test_with_factor, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_binary_ops_with_scalar_eq_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_binary_ops_with_scalar_ge_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cpu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cpu_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_clone_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_clone_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_contiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_contiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_device_checks_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_jagged_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_jagged_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_strided_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_strided_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_embedding_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_embedding_strided_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_breaking_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_breaking_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_with_bmm_path_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_with_bmm_path_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_masked_select_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_in_place_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_transpose_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_chunk_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_128_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_128_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_256_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_256_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_384_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_384_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_8_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_8_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_div_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_div_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sum_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_4_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim2_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_simple_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_simple_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_then_from_padded_tensor_no_transform0213_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_abs__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_abs_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_cos_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_gelu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_gelu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_isinf_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_isnan_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_isneginf_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_isposinf_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_logical_not_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_neg_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_relu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_relu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_sgn_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_silu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_silu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_sin_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_sqrt_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_tanh__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_tanh_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float64, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_abs_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_accumulate_grad_different_strides_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_as_nested_tensor_propagates_gradients_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_add_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_for_add_op_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_for_sub_op_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_sub_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_dropout_backward_jagged_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_dropout_backward_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_gelu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_indexing_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_2_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_32_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_edge_case_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_1023_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_1024_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_256_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_2_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_32_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_512_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_513_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_masked_fill_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_bmm_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_bmm_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_list_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_mask_and_to_padded_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_padded_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_padded_fused_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_generates_leaf_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_plus_transpose_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_matmul_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_matmul_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_reshape_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_reshape_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_softmax_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_squeeze_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_squeeze_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_to_padded_tensor_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_transpose_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_transpose_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_unsqueeze_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_unsqueeze_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_relu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_selu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_set_requires_grad_from_list_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_set_requires_grad_from_mask_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_split_with_sizes_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_to_buffer_series_ops_grad_with_broadcast_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_unbind_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_values_grad_with_broadcast_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_apply__cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_autograd_function_with_None_grad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_broadcasting_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_transposed_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_with_nested_int_second_arg_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_broadcast_shapes_on_in_graph_constructed_njt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_chunk_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_padded_dense_conversion_preserves_metadata_cache_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_preserves_metadata_cache_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_max_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_min_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_propagated_dynamic_max_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_composite_op_in_inference_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_composite_op_with_custom_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_construction_from_list_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_copy__cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dropout_inference_mode_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dummy_mha_with_nt_use_legacy_api_False_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dummy_mha_with_nt_use_legacy_api_True_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flatten_decomp_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flex_attention_converts_stacked_seq_indices_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flex_attention_noncontig_with_holes_False_cross_attention_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flex_attention_noncontig_with_holes_False_cross_attention_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flex_attention_noncontig_with_holes_True_cross_attention_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flex_attention_noncontig_with_holes_True_cross_attention_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_index_put_error_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_is_contiguous_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_is_same_size_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_with_pinned_memory_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_2d_input_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layout_under_torch_dispatch_mode_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_shape_empty_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_shape_randn_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_empty_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_full_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_ones_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_rand_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_randint_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_randn_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_zeros_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_4_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_nt_dim_5_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_narrow_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_activation_checkpoint_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_fx_trace_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_pass_min_max_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_pass_min_max_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_njt_cat_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_transposed_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_transposed_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_transposed_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_with_holes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_with_holes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_with_holes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_permute_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_pin_memory_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_profiler_sequence_nr_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_record_stream_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_reshape_decomp_requires_grad_False_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_reshape_decomp_requires_grad_True_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_autocast_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_flop_counter_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_contig_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_contig_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_noncontig_transposed_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_noncontig_transposed_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_noncontig_with_holes_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_noncontig_with_holes_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_False_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_False_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_True_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_True_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_False_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_False_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_True_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_True_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_False_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_False_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_True_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_True_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_False_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_False_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_True_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_True_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_specialize_dynamic_shape_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_specialize_dynamic_shape_recompile_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_split_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_split_with_sizes_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_squeeze_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_tensor_attributes_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_threshold_backward_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_copy_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_dtype_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_True_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_True_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_False_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_True_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unary_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unary_pointwise_transposed_inputs_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_0_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_1_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_2_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_equals_2_bad_dim_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_2_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_3_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_last_dim_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unsafe_view_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_view_ragged_idx_not_one_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_views_inherit_ragged_dim_cuda, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_bmm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clone_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_index_put_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_max_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_min_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_narrow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_embedding_bag_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_embedding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_rms_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_split_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_squeeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_to_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_unflatten_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_bmm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_clone_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_index_put_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_max_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_min_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_narrow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_embedding_bag_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_embedding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_rms_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_split_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_squeeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_to_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_unflatten_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bmm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_byte_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_char_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clone_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_count_nonzero_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_floor_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_hash_tensor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_heaviside_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_igamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_igammac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_index_put_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_int_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isclose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isnan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isneginf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isposinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isreal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_binary_return_by_ref_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_le_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_and_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_not_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_or_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_xor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_long_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_max_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_min_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_narrow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nextafter_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_embedding_bag_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_embedding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_rms_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_short_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_airy_ai_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_j1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_y1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_hermite_polynomial_h_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_hermite_polynomial_he_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_laguerre_polynomial_l_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_legendre_polynomial_p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_scaled_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_scaled_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_spherical_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_zeta_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_split_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_squeeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_to_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_unflatten_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bmm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_byte_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_char_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_clone_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_count_nonzero_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_div_trunc_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_floor_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_frac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_hash_tensor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_heaviside_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_igamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_igammac_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_index_put_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_int_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isclose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isnan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isneginf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isposinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isreal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_binary_return_by_ref_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_le_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_and_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_not_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_or_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_xor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_long_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_matmul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_max_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_min_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_narrow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nextafter_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_embedding_bag_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_embedding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_logsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_rms_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_tanhshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_short_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_airy_ai_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_j1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_y1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_h_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_he_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_laguerre_polynomial_l_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_legendre_polynomial_p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_log_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_scaled_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_scaled_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_spherical_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_zeta_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_split_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_squeeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_to_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_unflatten_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_nested_tensor_input_mutation_backward_cuda, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_nested_tensor_non_contiguous_mutation_cuda 2025-08-14T22:29:28.0688201Z 2025-08-14T22:29:31.2152854Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:31.2155121Z import pkg_resources 2025-08-14T22:29:31.6027314Z Running xpu/test_gemm 1/1 ... [2025-08-14 22:29:31.602131] 2025-08-14T22:29:31.6027970Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:31.6029367Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'xpu/test_gemm.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:31.602546] 2025-08-14T22:29:32.0018390Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:32.0019970Z import pkg_resources 2025-08-14T22:29:32.3803967Z Running torch_np/test_ufuncs_basic 1/1 ... [2025-08-14 22:29:32.379818] 2025-08-14T22:29:32.3804654Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:32.3806794Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_ufuncs_basic.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:32.380286] 2025-08-14T22:29:36.3262211Z 2025-08-14T22:29:36.3263142Z xpu/test_gemm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/xpu.test_gemm_1.1_2e73dd04997b20e6_.log 2025-08-14T22:29:36.3263780Z Running 0 items in this shard: 2025-08-14T22:29:36.3263972Z 2025-08-14T22:29:37.6080626Z 2025-08-14T22:29:37.6081894Z torch_np/test_ufuncs_basic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_ufuncs_basic_1.1_97cbd1fcd97e340b_.log 2025-08-14T22:29:37.6222771Z Running 371 items in this shard: test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_scalar_ufunc0, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_equiv_ufunc0_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_equiv_ufunc0_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_equiv_ufunc0_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_no_ufunc0_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_no_ufunc0_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_no_ufunc0_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_safe_ufunc0_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_safe_ufunc0_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_safe_ufunc0_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_same_kind_ufunc0_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_same_kind_ufunc0_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_same_kind_ufunc0_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_unsafe_ufunc0_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_unsafe_ufunc0_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_casting_casting_unsafe_ufunc0_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_dtype_ufunc0, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_broadcast_ufunc0, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_equiv_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_equiv_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_equiv_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_no_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_no_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_no_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_safe_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_safe_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_safe_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_same_kind_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_same_kind_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_same_kind_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_unsafe_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_unsafe_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestUnaryUfuncs::test_x_and_out_casting_casting_unsafe_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc0, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc1, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc10, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc11, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc12, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc13, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc14, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc15, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc16, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc2, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc3, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc4, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc5, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc6, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc7, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc8, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_scalar_ufunc9, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc0, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc1, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc10, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc11, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc12, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc13, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc14, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc15, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc16, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc2, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc3, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc4, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc5, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc6, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc7, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc8, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_vector_vs_scalar_ufunc9, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc0, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc1, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc10, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc11, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc12, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc13, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc14, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc15, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc16, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc2, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc3, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc4, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc5, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc6, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc7, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc8, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_broadcast_ufunc9, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc10_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc10_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc10_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc11_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc11_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc11_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc12_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc12_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc12_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc13_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc13_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc13_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc14_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc14_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc14_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc15_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc15_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc15_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc16_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc16_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc16_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc1_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc1_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc1_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc2_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc2_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc2_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc3_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc3_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc3_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc4_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc4_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc4_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc5_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc5_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc5_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc6_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc6_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc6_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc7_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc7_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc7_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc8_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc8_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc8_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc9_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc9_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_equiv_ufunc9_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc10_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc10_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc10_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc11_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc11_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc11_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc12_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc12_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc12_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc13_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc13_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc13_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc14_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc14_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc14_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc15_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc15_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc15_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc16_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc16_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc16_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc1_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc1_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc1_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc2_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc2_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc2_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc3_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc3_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc3_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc4_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc4_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc4_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc5_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc5_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc5_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc6_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc6_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc6_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc7_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc7_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc7_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc8_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc8_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc8_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc9_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc9_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_no_ufunc9_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc10_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc10_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc10_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc11_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc11_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc11_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc12_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc12_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc12_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc13_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc13_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc13_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc14_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc14_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc14_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc15_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc15_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc15_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc16_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc16_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc16_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc1_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc1_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc1_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc2_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc2_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc2_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc3_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc3_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc3_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc4_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc4_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc4_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc5_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc5_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc5_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc6_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc6_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc6_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc7_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc7_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc7_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc8_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc8_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc8_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc9_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc9_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_safe_ufunc9_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc10_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc10_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc10_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc11_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc11_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc11_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc12_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc12_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc12_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc13_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc13_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc13_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc14_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc14_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc14_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc15_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc15_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc15_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc16_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc16_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc16_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc1_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc1_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc1_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc2_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc2_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc2_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc3_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc3_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc3_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc4_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc4_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc4_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc5_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc5_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc5_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc6_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc6_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc6_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc7_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc7_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc7_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc8_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc8_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc8_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc9_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc9_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_same_kind_ufunc9_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc0_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc0_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc0_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc10_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc10_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc10_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc11_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc11_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc11_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc12_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc12_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc12_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc13_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc13_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc13_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc14_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc14_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc14_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc15_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc15_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc15_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc16_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc16_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc16_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc1_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc1_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc1_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc2_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc2_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc2_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc3_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc3_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc3_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc4_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc4_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc4_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc5_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc5_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc5_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc6_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc6_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc6_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc7_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc7_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc7_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc8_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc8_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc8_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc9_out_dtype_complex128, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc9_out_dtype_float32, test/torch_np/test_ufuncs_basic.py::TestBinaryUfuncs::test_xy_and_out_casting_casting_unsafe_ufunc9_out_dtype_float64, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_basic_ufunc0_op0_iop0, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_basic_ufunc1_op1_iop1, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_basic_ufunc2_op2_iop2, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_bcast_ufunc0_op0_iop0, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_bcast_ufunc1_op1_iop1, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_bcast_ufunc2_op2_iop2, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc0_op0_iop0_other_dtype0, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc0_op0_iop0_other_dtype1, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc0_op0_iop0_other_dtype2, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc0_op0_iop0_other_dtype3, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc1_op1_iop1_other_dtype0, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc1_op1_iop1_other_dtype1, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc1_op1_iop1_other_dtype2, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc1_op1_iop1_other_dtype3, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc2_op2_iop2_other_dtype0, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc2_op2_iop2_other_dtype1, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc2_op2_iop2_other_dtype2, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_array_ufunc2_op2_iop2_other_dtype3, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc0_op0_iop0_other_dtype0, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc0_op0_iop0_other_dtype1, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc0_op0_iop0_other_dtype2, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc0_op0_iop0_other_dtype3, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc1_op1_iop1_other_dtype0, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc1_op1_iop1_other_dtype1, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc1_op1_iop1_other_dtype2, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc1_op1_iop1_other_dtype3, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc2_op2_iop2_other_dtype0, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc2_op2_iop2_other_dtype1, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc2_op2_iop2_other_dtype2, test/torch_np/test_ufuncs_basic.py::TestNdarrayDunderVsUfunc::test_other_scalar_ufunc2_op2_iop2_other_dtype3, test/torch_np/test_ufuncs_basic.py::TestUfuncDtypeKwd::test_binary_ufunc_dtype, test/torch_np/test_ufuncs_basic.py::TestUfuncDtypeKwd::test_binary_ufunc_dtype_and_out 2025-08-14T22:29:37.6361489Z 2025-08-14T22:29:40.4875883Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:40.4878490Z import pkg_resources 2025-08-14T22:29:40.8783208Z Running test_meta 1/1 ... [2025-08-14 22:29:40.877776] 2025-08-14T22:29:40.8783679Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:40.8785683Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_meta.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:40.878209] 2025-08-14T22:29:41.7122764Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:41.7124189Z import pkg_resources 2025-08-14T22:29:42.0969731Z Running test_utils_filelock 1/1 ... [2025-08-14 22:29:42.096489] 2025-08-14T22:29:42.0971991Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:42.0972983Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_utils_filelock.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:42.096880] 2025-08-14T22:29:46.5704367Z 2025-08-14T22:29:46.5705404Z test_utils_filelock 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_utils_filelock_1.1_f776f9c8ed7aa0d4_.log 2025-08-14T22:29:46.5706767Z Running 2 items in this shard: test/test_utils_filelock.py::TestFileLock::test_no_crash, test/test_utils_filelock.py::TestFileLock::test_sequencing 2025-08-14T22:29:46.5707311Z 2025-08-14T22:29:50.6935439Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:50.6936905Z import pkg_resources 2025-08-14T22:29:51.0833584Z Running torch_np/test_random 1/1 ... [2025-08-14 22:29:51.082844] 2025-08-14T22:29:51.0834157Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:29:51.0836811Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_random.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:29:51.083267] 2025-08-14T22:29:55.7573186Z 2025-08-14T22:29:55.7574011Z torch_np/test_random 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_random_1.1_87bbaeba6ffe2b3a_.log 2025-08-14T22:29:55.7587478Z Running 41 items in this shard: test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func0, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func1, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func2, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func3, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func6, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func7, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_random_random, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_random_sample, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func0, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func1, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func2, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func3, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func6, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func7, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_random_random, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_random_sample, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func0, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func1, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func2, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func3, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func6, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func7, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_random_random, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_random_sample, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func0, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func1, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func2, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func3, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func6, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func7, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_random_random, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_random_sample, test/torch_np/test_random.py::TestShuffle::test_1d_use_numpy_False, test/torch_np/test_random.py::TestShuffle::test_1d_use_numpy_True, test/torch_np/test_random.py::TestShuffle::test_2d_use_numpy_False, test/torch_np/test_random.py::TestShuffle::test_2d_use_numpy_True, test/torch_np/test_random.py::TestShuffle::test_shuffle_list_use_numpy_False, test/torch_np/test_random.py::TestShuffle::test_shuffle_list_use_numpy_True, test/torch_np/test_random.py::TestChoice::test_choice_use_numpy_False, test/torch_np/test_random.py::TestChoice::test_choice_use_numpy_True, test/torch_np/test_random.py::TestNumpyGlobal::test_numpy_global 2025-08-14T22:29:55.7600462Z 2025-08-14T22:29:59.7836681Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:29:59.7838288Z import pkg_resources 2025-08-14T22:30:00.1605800Z Running profiler/test_kineto 1/1 ... [2025-08-14 22:30:00.160090] 2025-08-14T22:30:00.1606236Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:30:00.1609896Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_kineto.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:30:00.160472] 2025-08-14T22:30:04.5334110Z 2025-08-14T22:30:04.5335220Z profiler/test_kineto 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_kineto_1.1_be7a79ae7a1c29a8_.log 2025-08-14T22:30:04.5336387Z Running 1 items in this shard: test/profiler/test_kineto.py::SimpleKinetoInitializationTest::test_kineto_profiler_with_environment_variable 2025-08-14T22:30:04.5336971Z 2025-08-14T22:30:08.6370539Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:30:08.6371857Z import pkg_resources 2025-08-14T22:30:09.0340376Z Running test_compile_benchmark_util 1/1 ... [2025-08-14 22:30:09.033597] 2025-08-14T22:30:09.0340797Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:30:09.0344421Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_compile_benchmark_util.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:30:09.034052] 2025-08-14T22:30:13.8088578Z 2025-08-14T22:30:13.8089370Z test_compile_benchmark_util 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_compile_benchmark_util_1.1_f96d6f04f6075702_.log 2025-08-14T22:30:13.8090407Z Running 1 items in this shard: test/test_compile_benchmark_util.py::TestCompileBenchmarkUtil::test_training_and_inference 2025-08-14T22:30:13.8090896Z 2025-08-14T22:30:17.9312743Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:30:17.9314062Z import pkg_resources 2025-08-14T22:30:18.3297603Z Running test_jiterator 1/1 ... [2025-08-14 22:30:18.329350] 2025-08-14T22:30:18.3298009Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:30:18.3301725Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jiterator.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:30:18.329816] 2025-08-14T22:30:21.0514099Z 2025-08-14T22:30:21.0514986Z test_typing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_typing_1.1_266fde289d6a6dd0_.log 2025-08-14T22:30:21.0519459Z Running 18 items in this shard: test/test_typing.py::TestTyping::test_fail_arithmetic_ops.py, test/test_typing.py::TestTyping::test_fail_creation_ops.py, test/test_typing.py::TestTyping::test_fail_random.py, test/test_typing.py::TestTyping::test_fail_torch_size.py, test/test_typing.py::TestTyping::test_reveal_module_list.py, test/test_typing.py::TestTyping::test_reveal_namedtuple.py, test/test_typing.py::TestTyping::test_reveal_opt_size.py, test/test_typing.py::TestTyping::test_reveal_size.py, test/test_typing.py::TestTyping::test_reveal_tensor_constructors.py, test/test_typing.py::TestTyping::test_reveal_tensor_copy.py, test/test_typing.py::TestTyping::test_reveal_tensor_sampling.py, test/test_typing.py::TestTyping::test_reveal_torch_optim.py, test/test_typing.py::TestTyping::test_success_arithmetic_ops.py, test/test_typing.py::TestTyping::test_success_creation_ops.py, test/test_typing.py::TestTyping::test_success_cuda_steam.py, test/test_typing.py::TestTyping::test_success_distributions.py, test/test_typing.py::TestTyping::test_success_math_ops.py, test/test_typing.py::TestTyping::test_success_torch_size.py 2025-08-14T22:30:21.0523421Z 2025-08-14T22:30:23.4054156Z 2025-08-14T22:30:23.4055108Z test_jiterator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_jiterator_1.1_aa71ca629a29c84e_.log 2025-08-14T22:30:23.4169754Z Running 289 items in this shard: test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_bfloat16_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex128_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_complex64_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float16_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float32_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_float64_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int16_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int32_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int64_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_int8_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_contiguous_shape_strides0_cuda_uint8_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_bfloat16_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex128_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_complex64_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float16_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float32_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_float64_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int16_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int32_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int64_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_int8_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_complex128, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_complex64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_int16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_int32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_int64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_int8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_all_dtype_noncontiguous_shape_strides0_cuda_uint8_uint8, test/test_jiterator.py::TestPythonJiteratorCUDA::test_bool_extra_args_is_train_False_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_bool_extra_args_is_train_True_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta2_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta2_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta2_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta2_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta_-4_2_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta_-4_2_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta_-4_2_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta_-4_2_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta_3_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta_3_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta_3_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha2_beta_3_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta2_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta2_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta2_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta2_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta_-4_2_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta_-4_2_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta_-4_2_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta_-4_2_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta_3_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta_3_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta_3_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_-1_beta_3_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta2_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta2_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta2_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta2_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta_-4_2_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta_-4_2_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta_-4_2_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta_-4_2_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta_3_cuda_bfloat16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta_3_cuda_float16, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta_3_cuda_float32, test/test_jiterator.py::TestPythonJiteratorCUDA::test_extra_args_alpha_2_0_beta_3_cuda_float64, test/test_jiterator.py::TestPythonJiteratorCUDA::test_invalid_function_name_code_string_template T my _kernel(T x) { return x; }_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_invalid_function_name_code_string_template Tmy_kernel(T x) { return x; }_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_multiple_functors_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_various_num_inputs_num_inputs_1_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_various_num_inputs_num_inputs_5_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_various_num_inputs_num_inputs_8_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_various_num_outputs_num_outputs_1_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_various_num_outputs_num_outputs_4_cuda, test/test_jiterator.py::TestPythonJiteratorCUDA::test_various_num_outputs_num_outputs_8_cuda 2025-08-14T22:30:23.4289879Z 2025-08-14T22:30:25.1648271Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:30:25.1650871Z import pkg_resources 2025-08-14T22:30:25.5464808Z Running test_optim 1/1 ... [2025-08-14 22:30:25.545939] 2025-08-14T22:30:25.5465200Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:30:25.5468912Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_optim.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:30:25.546411] 2025-08-14T22:30:27.5436355Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:30:27.5437698Z import pkg_resources 2025-08-14T22:30:27.9273046Z Running test_decomp 1/16 ... [2025-08-14 22:30:27.926774] 2025-08-14T22:30:27.9273455Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:30:27.9276287Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=1', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:30:27.927211] 2025-08-14T22:30:58.2171257Z 2025-08-14T22:30:58.2172704Z test_optim 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_optim_1.1_8cc0d005484d4aeb_.log 2025-08-14T22:30:58.2704324Z Running 966 items in this shard: test/test_optim.py::TestLRScheduler::test_CosineAnnealingWarmRestarts_T_cur_reset, test/test_optim.py::TestLRScheduler::test_CosineAnnealingWarmRestarts_lr1_T_mult_1, test/test_optim.py::TestLRScheduler::test_CosineAnnealingWarmRestarts_lr1_T_mult_2, test/test_optim.py::TestLRScheduler::test_CosineAnnealingWarmRestarts_lr1_T_mult_4, test/test_optim.py::TestLRScheduler::test_CosineAnnealingWarmRestarts_lr2, test/test_optim.py::TestLRScheduler::test_CosineAnnealingWarmRestarts_lr3, test/test_optim.py::TestLRScheduler::test_CosineAnnealingWarmRestarts_lr_state_dict, test/test_optim.py::TestLRScheduler::test_add_param_group_does_not_break_reduce_lr_on_plateau_min_lr_list, test/test_optim.py::TestLRScheduler::test_add_param_group_does_not_break_reduce_lr_on_plateau_min_lr_scalar, test/test_optim.py::TestLRScheduler::test_add_param_group_errors_reduce_lr_on_plateau, test/test_optim.py::TestLRScheduler::test_chained_lr1, test/test_optim.py::TestLRScheduler::test_chained_lr2, test/test_optim.py::TestLRScheduler::test_chained_lr2_get_last_lr_before_step, test/test_optim.py::TestLRScheduler::test_chained_lr3, test/test_optim.py::TestLRScheduler::test_chained_lr4, test/test_optim.py::TestLRScheduler::test_chained_lr5, test/test_optim.py::TestLRScheduler::test_closed_form_constantlr, test/test_optim.py::TestLRScheduler::test_closed_form_cos_anneal_lr, test/test_optim.py::TestLRScheduler::test_closed_form_exp_lr, test/test_optim.py::TestLRScheduler::test_closed_form_linearlr, test/test_optim.py::TestLRScheduler::test_closed_form_multi_step_lr, test/test_optim.py::TestLRScheduler::test_closed_form_poly_lr, test/test_optim.py::TestLRScheduler::test_closed_form_step_lr, test/test_optim.py::TestLRScheduler::test_compound_cosanneal_and_exp_lr, test/test_optim.py::TestLRScheduler::test_compound_cosanneal_and_linearlr, test/test_optim.py::TestLRScheduler::test_compound_cosanneal_and_multistep_lr, test/test_optim.py::TestLRScheduler::test_compound_cosanneal_and_step_lr, test/test_optim.py::TestLRScheduler::test_compound_exp_and_linearlr, test/test_optim.py::TestLRScheduler::test_compound_exp_and_multistep_lr, test/test_optim.py::TestLRScheduler::test_compound_linearlr_and_multistep_lr, test/test_optim.py::TestLRScheduler::test_compound_reduce_lr_on_plateau1, test/test_optim.py::TestLRScheduler::test_compound_reduce_lr_on_plateau2, test/test_optim.py::TestLRScheduler::test_compound_reduce_lr_on_plateau3, test/test_optim.py::TestLRScheduler::test_compound_reduce_lr_on_plateau4, test/test_optim.py::TestLRScheduler::test_compound_reduce_lr_on_plateau5, test/test_optim.py::TestLRScheduler::test_compound_step_and_constantlr, test/test_optim.py::TestLRScheduler::test_compound_step_and_exp_lr, test/test_optim.py::TestLRScheduler::test_compound_step_and_multistep_lr, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass0, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass1, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass2, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass3, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass4, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass5, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass6, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass7, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass8, test/test_optim.py::TestLRScheduler::test_constant_initial_lr_LRClass9, test/test_optim.py::TestLRScheduler::test_constant_initial_params_cyclelr, test/test_optim.py::TestLRScheduler::test_constant_initial_params_onecyclelr, test/test_optim.py::TestLRScheduler::test_constant_initial_params_swalr, test/test_optim.py::TestLRScheduler::test_constantlr, test/test_optim.py::TestLRScheduler::test_constantlr_is_constant_for_constant_epoch, test/test_optim.py::TestLRScheduler::test_constantlr_with_epoch, test/test_optim.py::TestLRScheduler::test_cos_anneal_lr, test/test_optim.py::TestLRScheduler::test_cos_anneal_lr_continue, test/test_optim.py::TestLRScheduler::test_cosine_lr_state_dict, test/test_optim.py::TestLRScheduler::test_cosine_then_cyclic, test/test_optim.py::TestLRScheduler::test_cycle_lr_cycle_momentum_fail_with_momentumless_optimizer, test/test_optim.py::TestLRScheduler::test_cycle_lr_cycle_momentum_with_beta1_optimizer, test/test_optim.py::TestLRScheduler::test_cycle_lr_exp_range_mode, test/test_optim.py::TestLRScheduler::test_cycle_lr_exp_range_mode_one_lr, test/test_optim.py::TestLRScheduler::test_cycle_lr_exp_range_mode_step_size_up_down, test/test_optim.py::TestLRScheduler::test_cycle_lr_invalid_mode, test/test_optim.py::TestLRScheduler::test_cycle_lr_removed_after_out_of_scope, test/test_optim.py::TestLRScheduler::test_cycle_lr_scale_fn_restored_from_state_dict, test/test_optim.py::TestLRScheduler::test_cycle_lr_state_dict_picklable, test/test_optim.py::TestLRScheduler::test_cycle_lr_triangular2_mode, test/test_optim.py::TestLRScheduler::test_cycle_lr_triangular2_mode_one_lr, test/test_optim.py::TestLRScheduler::test_cycle_lr_triangular2_mode_step_size_up_down, test/test_optim.py::TestLRScheduler::test_cycle_lr_triangular_mode, test/test_optim.py::TestLRScheduler::test_cycle_lr_triangular_mode_one_lr, test/test_optim.py::TestLRScheduler::test_cycle_lr_triangular_mode_one_lr_no_momentum, test/test_optim.py::TestLRScheduler::test_cycle_lr_triangular_mode_step_size_up_down, test/test_optim.py::TestLRScheduler::test_cycle_lr_with_adam, test/test_optim.py::TestLRScheduler::test_cycle_lr_with_momentumless_optimizer, test/test_optim.py::TestLRScheduler::test_error_when_getlr_has_epoch, test/test_optim.py::TestLRScheduler::test_exp_lr, test/test_optim.py::TestLRScheduler::test_exp_step_lr_state_dict, test/test_optim.py::TestLRScheduler::test_exponential_lr_is_constant_for_constant_epoch, test/test_optim.py::TestLRScheduler::test_get_last_lr_constantlr, test/test_optim.py::TestLRScheduler::test_get_last_lr_linearlr, test/test_optim.py::TestLRScheduler::test_get_last_lr_multi_step_lr, test/test_optim.py::TestLRScheduler::test_get_last_lr_sequentiallr, test/test_optim.py::TestLRScheduler::test_get_last_lr_step_lr, test/test_optim.py::TestLRScheduler::test_lambda_lr, test/test_optim.py::TestLRScheduler::test_lambda_lr_state_dict_fn, test/test_optim.py::TestLRScheduler::test_lambda_lr_state_dict_obj, test/test_optim.py::TestLRScheduler::test_linear_linearlr_is_constant_for_constant_epoch, test/test_optim.py::TestLRScheduler::test_linearlr, test/test_optim.py::TestLRScheduler::test_linearlr_start_factor_limits1, test/test_optim.py::TestLRScheduler::test_linearlr_start_factor_limits2, test/test_optim.py::TestLRScheduler::test_linearlr_with_epoch, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass0, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass1, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass10, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass11, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass12, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass2, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass3, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass4, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass5, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass6, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass7, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass8, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_LRClass9, test/test_optim.py::TestLRScheduler::test_lr_scheduler_checkpoint_on_plateau, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass0_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass0_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass10_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass10_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass11_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass11_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass12_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass12_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass13_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass13_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass14_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass14_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass1_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass1_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass2_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass2_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass3_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass3_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass4_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass4_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass5_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass5_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass6_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass6_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass7_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass7_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass8_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass8_weights_only_True, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass9_weights_only_False, test/test_optim.py::TestLRScheduler::test_lr_scheduler_state_dict_load_LRClass9_weights_only_True, test/test_optim.py::TestLRScheduler::test_multi_step_lr, test/test_optim.py::TestLRScheduler::test_multi_step_lr_state_dict, test/test_optim.py::TestLRScheduler::test_multi_step_lr_with_epoch, test/test_optim.py::TestLRScheduler::test_multiplicative_lr, test/test_optim.py::TestLRScheduler::test_multiplicative_lr_with_lr_lambda, test/test_optim.py::TestLRScheduler::test_new_pattern_no_warning, test/test_optim.py::TestLRScheduler::test_new_pattern_no_warning_with_arg, test/test_optim.py::TestLRScheduler::test_new_pattern_no_warning_with_overridden_optim_step, test/test_optim.py::TestLRScheduler::test_no_cyclic_references, test/test_optim.py::TestLRScheduler::test_no_cyclic_references_in_step, test/test_optim.py::TestLRScheduler::test_old_pattern_warning, test/test_optim.py::TestLRScheduler::test_old_pattern_warning_resuming, test/test_optim.py::TestLRScheduler::test_old_pattern_warning_resuming_with_arg, test/test_optim.py::TestLRScheduler::test_old_pattern_warning_with_arg, test/test_optim.py::TestLRScheduler::test_old_pattern_warning_with_overridden_optim_step, test/test_optim.py::TestLRScheduler::test_onecycle_lr_cannot_calculate_total_steps, test/test_optim.py::TestLRScheduler::test_onecycle_lr_cosine_annealing, test/test_optim.py::TestLRScheduler::test_onecycle_lr_invalid_anneal_strategy, test/test_optim.py::TestLRScheduler::test_onecycle_lr_invalid_pct_start, test/test_optim.py::TestLRScheduler::test_onecycle_lr_legacy_state_dict, test/test_optim.py::TestLRScheduler::test_onecycle_lr_linear_annealing, test/test_optim.py::TestLRScheduler::test_onecycle_lr_linear_annealing_three_phases, test/test_optim.py::TestLRScheduler::test_poly_lr, test/test_optim.py::TestLRScheduler::test_polynomial_lr_is_constant_for_constant_epoch, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau1, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau2, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau3, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau4, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau5, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau6, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau7, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau8, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau_get_last_lr_before_step, test/test_optim.py::TestLRScheduler::test_reduce_lr_on_plateau_state_dict, test/test_optim.py::TestLRScheduler::test_sequentiallr1, test/test_optim.py::TestLRScheduler::test_sequentiallr2, test/test_optim.py::TestLRScheduler::test_sequentiallr3, test/test_optim.py::TestLRScheduler::test_sequentiallr4, test/test_optim.py::TestLRScheduler::test_sequentiallr5, test/test_optim.py::TestLRScheduler::test_step_lr, test/test_optim.py::TestLRScheduler::test_step_lr_is_constant_for_constant_epoch, test/test_optim.py::TestLRScheduler::test_step_lr_state_dict, test/test_optim.py::TestLRScheduler::test_swa_lr_state_dict, test/test_optim.py::TestLRScheduler::test_swalr_cosine_anneal_after_multiplicative, test/test_optim.py::TestLRScheduler::test_swalr_hypers, test/test_optim.py::TestLRScheduler::test_swalr_linear_anneal_after_multiplicative, test/test_optim.py::TestLRScheduler::test_swalr_no_anneal, test/test_optim.py::TestDifferentiableOptimizer::test_adadelta, test/test_optim.py::TestDifferentiableOptimizer::test_adagrad, test/test_optim.py::TestDifferentiableOptimizer::test_adam, test/test_optim.py::TestDifferentiableOptimizer::test_adam_differentiable_all_hyperparams, test/test_optim.py::TestDifferentiableOptimizer::test_adam_differentiable_betas, test/test_optim.py::TestDifferentiableOptimizer::test_adam_differentiable_lr, test/test_optim.py::TestDifferentiableOptimizer::test_adam_differentiable_weight_decay, test/test_optim.py::TestDifferentiableOptimizer::test_adamax, test/test_optim.py::TestDifferentiableOptimizer::test_adamw, test/test_optim.py::TestDifferentiableOptimizer::test_adamw_differentiable_all_hyperparams, test/test_optim.py::TestDifferentiableOptimizer::test_adamw_differentiable_betas, test/test_optim.py::TestDifferentiableOptimizer::test_adamw_differentiable_lr, test/test_optim.py::TestDifferentiableOptimizer::test_adamw_differentiable_weight_decay, test/test_optim.py::TestDifferentiableOptimizer::test_asgd, test/test_optim.py::TestDifferentiableOptimizer::test_differentiable_lr, test/test_optim.py::TestDifferentiableOptimizer::test_differentiable_weight_decay, test/test_optim.py::TestDifferentiableOptimizer::test_differentiable_weight_decay_and_lr, test/test_optim.py::TestDifferentiableOptimizer::test_nadam, test/test_optim.py::TestDifferentiableOptimizer::test_radam, test/test_optim.py::TestDifferentiableOptimizer::test_rmsprop, test/test_optim.py::TestDifferentiableOptimizer::test_rprop, test/test_optim.py::TestDifferentiableOptimizer::test_sgd, test/test_optim.py::TestSWAUtils::test_averaged_model_all_devices_ema_False, test/test_optim.py::TestSWAUtils::test_averaged_model_all_devices_ema_True, test/test_optim.py::TestSWAUtils::test_averaged_model_default_avg_fn_picklable, test/test_optim.py::TestSWAUtils::test_averaged_model_exponential_use_multi_avg_fn_False_use_buffers_False, test/test_optim.py::TestSWAUtils::test_averaged_model_exponential_use_multi_avg_fn_False_use_buffers_True, test/test_optim.py::TestSWAUtils::test_averaged_model_exponential_use_multi_avg_fn_True_use_buffers_False, test/test_optim.py::TestSWAUtils::test_averaged_model_exponential_use_multi_avg_fn_True_use_buffers_True, test/test_optim.py::TestSWAUtils::test_averaged_model_mixed_device_ema_False, test/test_optim.py::TestSWAUtils::test_averaged_model_mixed_device_ema_True, test/test_optim.py::TestSWAUtils::test_averaged_model_state_dict, test/test_optim.py::TestSWAUtils::test_bn_update_eval_momentum, test/test_optim.py::TestSWAUtils::test_update_bn_cnn, test/test_optim.py::TestSWAUtils::test_update_bn_dnn, test/test_optim.py::TestOptimRenewedCUDA::test_adamw_serialization_cuda, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_False_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_False_is_named_optim1_True_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_False_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_from_to_named_state_dict_is_named_optim0_True_is_named_optim1_True_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_can_load_older_state_dict_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_ASGD_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_Adadelta_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_Adagrad_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_AdamW_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_Adam_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_Adamax_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_LBFGS_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_NAdam_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_RAdam_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_RMSprop_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_Rprop_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_2d_SGD_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_ASGD_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_Adadelta_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_Adagrad_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_AdamW_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_Adam_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_Adamax_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_LBFGS_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_NAdam_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_RAdam_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_RMSprop_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_Rprop_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_complex_SGD_cuda_complex64, test/test_optim.py::TestOptimRenewedCUDA::test_cpu_load_state_dict_impl_capturable_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_cpu_load_state_dict_impl_capturable_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_cpu_load_state_dict_impl_capturable_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_cpu_load_state_dict_impl_capturable_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_cpu_load_state_dict_impl_fused_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_cpu_load_state_dict_impl_fused_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_cpu_load_state_dict_impl_fused_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_cpu_load_state_dict_impl_fused_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_deepcopy_copies_all_public_attrs_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_defaults_changed_to_foreach_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_errors_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_ASGD_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_Adadelta_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_Adafactor_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_Adagrad_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_AdamW_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_Adam_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_Adamax_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_NAdam_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_RAdam_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_RMSprop_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_Rprop_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_large_tensor_SGD_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_ASGD_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_Adadelta_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_Adafactor_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_Adagrad_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_AdamW_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_Adam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_Adamax_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_NAdam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_RAdam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_RMSprop_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_Rprop_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_foreach_matches_forloop_SGD_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_False_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_False_with_lrsched_True_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_False_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_contiguous_True_with_lrsched_True_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_False_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_forloop_goes_right_direction_multigpu_with_lrsched_True_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_AdamW_cuda_bfloat16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_AdamW_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_AdamW_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_Adam_cuda_bfloat16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_Adam_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_Adam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_SGD_cuda_bfloat16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_SGD_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_cpu_matches_cuda_SGD_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_fused_does_not_step_if_foundinf_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_does_not_step_if_foundinf_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_does_not_step_if_foundinf_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_does_not_step_if_foundinf_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_error_on_params_on_meta_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_error_on_params_on_meta_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_error_on_params_on_meta_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_error_on_params_on_meta_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_large_tensor_Adagrad_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_large_tensor_AdamW_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_large_tensor_Adam_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_large_tensor_SGD_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_Adagrad_cuda_bfloat16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_Adagrad_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_Adagrad_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_AdamW_cuda_bfloat16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_AdamW_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_AdamW_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_Adam_cuda_bfloat16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_Adam_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_Adam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_SGD_cuda_bfloat16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_SGD_cuda_float16, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_fused_matches_forloop_SGD_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_grads_are_never_inplaced_into_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_nontensor_step_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_post_hook_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_hook_and_prepend_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_load_state_dict_pre_post_hook_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_foreach_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_mixed_device_dtype_impl_fused_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_non_empty_state_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optim_infos_do_not_specify_global_cliquey_kwargs_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_optimizer_can_be_printed_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_group_with_lrscheduler_goes_right_direction_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_lr_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_param_groups_weight_decay_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_peak_memory_foreach_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_rosenbrock_sparse_with_lrsched_False_Adagrad_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_rosenbrock_sparse_with_lrsched_False_SGD_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_rosenbrock_sparse_with_lrsched_False_SparseAdam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_rosenbrock_sparse_with_lrsched_True_Adagrad_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_rosenbrock_sparse_with_lrsched_True_SGD_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_rosenbrock_sparse_with_lrsched_True_SparseAdam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_False_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_save_load_equality_with_weights_only_is_named_optim_True_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_second_order_optims_return_consistent_types_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_ASGD_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_Adadelta_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_Adafactor_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_Adagrad_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_AdamW_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_Adam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_Adamax_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_NAdam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_RAdam_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_RMSprop_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_Rprop_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_set_default_dtype_works_with_foreach_SGD_cuda_float64, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_False_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_False_is_named_optim1_True_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_False_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_deterministic_is_named_optim0_True_is_named_optim1_True_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_post_hook_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_hook_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_pre_post_hook_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_state_dict_with_cuda_params_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_all_hooks_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_for_zero_grads_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_is_noop_when_params_have_no_grad_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_post_hook_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_step_pre_hook_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_0_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_1_SparseAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_ASGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_Adadelta_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_Adafactor_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_Adagrad_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_AdamW_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_Adam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_Adamax_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_LBFGS_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_NAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_RAdam_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_RMSprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_Rprop_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_SGD_cuda_float32, test/test_optim.py::TestOptimRenewedCUDA::test_tensor_lr_num_dim_2_SparseAdam_cuda_float32 2025-08-14T22:30:58.3075993Z 2025-08-14T22:31:02.3213873Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:31:02.3216300Z import pkg_resources 2025-08-14T22:31:02.6975149Z Running test_decomp 5/16 ... [2025-08-14 22:31:02.696876] 2025-08-14T22:31:02.6975548Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:31:02.6976504Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=5', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:31:02.697252] 2025-08-14T22:32:45.8832175Z 2025-08-14T22:32:45.8833068Z test_meta 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_meta_1.1_09f4c36b551bfcb4_.log 2025-08-14T22:32:47.1206116Z Running 40667 items in this shard: test/test_meta.py::TestMetaConverter::test_channels_last, test/test_meta.py::TestMetaConverter::test_channels_last_leaf, test/test_meta.py::TestMetaConverter::test_channels_last_non_leaf, test/test_meta.py::TestMetaConverter::test_complex_noncontiguous_bug, test/test_meta.py::TestMetaConverter::test_empty_strided_non_dense_leaf, test/test_meta.py::TestMetaConverter::test_imag, test/test_meta.py::TestMetaConverter::test_inplace_set_storage, test/test_meta.py::TestMetaConverter::test_leaf, test/test_meta.py::TestMetaConverter::test_non_leaf, test/test_meta.py::TestMetaConverter::test_non_leaf_torture, test/test_meta.py::TestMetaConverter::test_requires_grad_false, test/test_meta.py::TestMetaConverter::test_tensor_outlives_converter, test/test_meta.py::TestMetaConverter::test_view_as_complex, test/test_meta.py::TestMetaConverter::test_view_as_real, test/test_meta.py::TestMetaConverter::test_view_dtype, test/test_meta.py::TestMetaConverter::test_view_mutate, test/test_meta.py::TestMetaConverter::test_view_of_leaf, test/test_meta.py::TestMetaConverter::test_view_of_non_leaf, test/test_meta.py::TestMetaConverter::test_view_of_view_of_leaf, test/test_meta.py::TestMetaConverter::test_weakref, test/test_meta.py::TestMetaCUDA::test_batch_norm_backward_output_mask0_cuda, test/test_meta.py::TestMetaCUDA::test_batch_norm_backward_output_mask1_cuda, test/test_meta.py::TestMetaCUDA::test_batch_norm_backward_output_mask2_cuda, test/test_meta.py::TestMetaCUDA::test_batch_norm_backward_output_mask3_cuda, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs__conversions_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs__conversions_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype__refs_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_binary_ufuncs_mixed_dtype_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_cdist_forward_cuda, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_inplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_meta_outplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_all_strides_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_all_strides_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_embedding_bag_byte_prepack_cuda, test/test_meta.py::TestMetaCUDA::test_embedding_bag_byte_unpack_cuda, test/test_meta.py::TestMetaCUDA::test_embedding_bag_dense_backward_mode_1_cuda, test/test_meta.py::TestMetaCUDA::test_embedding_bag_dense_backward_mode_2_cuda, test/test_meta.py::TestMetaCUDA::test_embedding_bag_dense_backward_per_sample_weights_cuda, test/test_meta.py::TestMetaCUDA::test_empty_quantized_cuda, test/test_meta.py::TestMetaCUDA::test_fill__alias_relationship_cuda, test/test_meta.py::TestMetaCUDA::test_fill_stride_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask0_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask1_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask2_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask3_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask4_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask5_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask6_cuda, test/test_meta.py::TestMetaCUDA::test_group_norm_backward_output_mask7_cuda, test/test_meta.py::TestMetaCUDA::test_huber_loss_backward_cuda, test/test_meta.py::TestMetaCUDA::test_index_select_out_cuda, test/test_meta.py::TestMetaCUDA::test_inplace_bin_ops_error_cuda, test/test_meta.py::TestMetaCUDA::test_inplace_masked_fill_error_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask0_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask1_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask2_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask3_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask4_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask5_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask6_cuda, test/test_meta.py::TestMetaCUDA::test_layer_norm_backward_output_mask7_cuda, test/test_meta.py::TestMetaCUDA::test_local_scalar_dense_call_cuda, test/test_meta.py::TestMetaCUDA::test_map_location_deserialize_cuda, test/test_meta.py::TestMetaCUDA::test_meta__fused_moving_avg_obs_fq_helper_cuda, test/test_meta.py::TestMetaCUDA::test_meta_autograd_no_error_cuda, test/test_meta.py::TestMetaCUDA::test_meta_consistency_out_dtype_mismatch_pow_Tensor_Scalar_cuda, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_inplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_H_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_T_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___getitem___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___radd___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rand___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rdiv___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmatmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmod___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rmul___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___ror___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rpow___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rsub___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace___rxor___cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__batch_norm_with_update_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__batch_norm_with_update_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__batch_norm_with_update_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__batch_norm_with_update_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__chunk_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcdiv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_div_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_frac_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lerp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_norm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__foreach_zero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__native_batch_norm_legit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__native_batch_norm_legit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__native_batch_norm_legit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__native_batch_norm_legit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_lengths_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_lengths_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_lengths_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_lengths_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_offsets_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_offsets_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_offsets_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__segment_reduce_offsets_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__softmax_backward_data_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__softmax_backward_data_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__softmax_backward_data_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__softmax_backward_data_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace__upsample_bilinear2d_aa_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__upsample_bilinear2d_aa_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace__upsample_bilinear2d_aa_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace__upsample_bilinear2d_aa_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_abs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_acosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcdiv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addcmul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmm_decomposed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addmv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_addr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_alias_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_all_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_allclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_aminmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_angle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_any_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_arange_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argsort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_argwhere_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_partial_views_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_as_strided_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_asinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_1d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_2d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_atleast_3d_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_baddbmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bernoulli_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bernoulli_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bernoulli_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bernoulli_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bfloat16_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bincount_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_left_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_right_shift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bitwise_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_block_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bool_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_shapes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_broadcast_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_bucketize_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_byte_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cartesian_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cauchy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cauchy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cauchy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cauchy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cdouble_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ceil_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cfloat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chalf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_char_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_inverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_inverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_inverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_inverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cholesky_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_max_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clamp_min_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_clone_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_column_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_combinations_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_conj_physical_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_constant_pad_nd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_contiguous_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_copysign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_corrcoef_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cos_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cosh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_count_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cov_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cummin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_cumulative_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_deg2rad_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diag_embed_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagflat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diagonal_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_diff_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_digamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_floor_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_no_rounding_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_div_trunc_rounding_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_double_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_dstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_einsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_permuted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eq_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_equal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_erfinv_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expand_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_expm1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exponential_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exponential_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float8_e4m3fnuz, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float8_e5m2, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_float8_e5m2fnuz, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_eye_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_fftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_hfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ifftshift_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_ihfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_irfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfft_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fft_rfftn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flip_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fliplr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_flipud_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_float_power_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_floor_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_fmod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frac_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frac_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_frexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_full_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gather_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gcd_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ge_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geometric_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geqrf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geqrf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geqrf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_geqrf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gradient_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_grid_sampler_2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_gt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_half_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hash_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_heaviside_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_histc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hypot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hypot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hypot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_hypot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_igamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_igamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_igammac_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_igammac_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_imag_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_imag_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_imag_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_index_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_inner_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_int_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isclose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isfinite_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isnan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isneginf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isposinf_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_isreal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_istft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_istft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_item_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_2inputs_2outputs_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_binary_return_by_ref_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_jiterator_unary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kron_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_kthvalue_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lcm_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ldexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_le_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lerp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lgamma_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cholesky_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cond_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cond_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cond_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cond_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_cross_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_det_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_det_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_det_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_det_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_diagonal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eig_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eig_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eig_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eig_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvalsh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvalsh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvalsh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_eigvalsh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_householder_product_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_householder_product_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_householder_product_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_householder_product_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_inv_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_ldl_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_grad_oriented_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_grad_oriented_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lstsq_grad_oriented_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_factor_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_power_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_power_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_power_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_power_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_matrix_rank_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_multi_dot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_norm_subgradients_at_zero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_hermitian_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_hermitian_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_hermitian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_hermitian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_singular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_singular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_singular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_pinv_singular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_slogdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_slogdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_slogdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_slogdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_ex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_ex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_ex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_ex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_triangular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_triangular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_triangular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_solve_triangular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svdvals_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svdvals_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svdvals_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_svdvals_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorinv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorinv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorinv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorinv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorsolve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorsolve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorsolve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_tensorsolve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vander_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vecdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linalg_vector_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_linspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log10_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log1p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_log_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logcumsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logdet_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logdet_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logdet_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logdet_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_and_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_not_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_or_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logical_xor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logspace_tensor_overload_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_long_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_unpack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_unpack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_unpack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_lu_unpack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mH_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mT_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_argmin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumprod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_cumsum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_fill_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_log_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_log_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_log_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_log_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logaddexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logaddexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logaddexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logaddexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_logsumexp_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_std_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_masked_var_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matmul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_matrix_exp_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_pool2d_with_indices_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_pool2d_with_indices_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_pool2d_with_indices_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_max_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_maximum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_median_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_list_of_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_meshgrid_variadic_tensors_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_binary_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_no_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_min_reduction_with_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_minimum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mode_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_movedim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_msort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mul_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_multinomial_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_multinomial_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_multinomial_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_multinomial_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mv_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nan_to_num_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanmedian_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanquantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nanquantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nansum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_narrow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_dropout_backward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_dropout_backward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_dropout_backward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_dropout_backward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_native_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ne_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_empty_strided_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_full_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_new_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nextafter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nextafter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nextafter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nextafter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_alpha_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_alpha_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_alpha_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_alpha_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_avg_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_celu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_celu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_celu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_celu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_channel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_conv_transpose3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_similarity_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_similarity_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_similarity_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cosine_similarity_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cross_entropy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cross_entropy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cross_entropy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_cross_entropy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_ctc_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_ctc_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_dropout_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_elu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_elu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_elu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_elu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_bag_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_bag_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_bag_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_bag_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_embedding_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_fractional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gaussian_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_gelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_glu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_glu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_glu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_glu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_grid_sample_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_grid_sample_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_grid_sample_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_grid_sample_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_group_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_group_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_group_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_group_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardswish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardswish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardswish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardswish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hardtanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_hinge_embedding_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_huber_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_huber_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_huber_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_huber_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_instance_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_instance_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_instance_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_instance_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_area_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_area_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_area_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_area_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bicubic_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bicubic_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bicubic_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_trilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_trilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_interpolate_trilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_kl_div_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_kl_div_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_kl_div_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_kl_div_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_layer_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_layer_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_layer_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_layer_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_leaky_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_leaky_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_leaky_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_leaky_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_linear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_local_response_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_local_response_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_local_response_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_local_response_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_logsigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_logsigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_logsigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_logsigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_margin_ranking_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_pool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool1d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool2d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_max_unpool3d_grad_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mish_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mish_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mish_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mish_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mse_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mse_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mse_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_mse_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_head_attention_forward_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multi_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_normalize_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_one_hot_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_circular_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_constant_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_reflect_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pad_replicate_negative_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pairwise_distance_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pdist_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pdist_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_shuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_pixel_unshuffle_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_poisson_nll_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_prelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_prelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_prelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_prelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu6_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_relu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rms_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rrelu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rrelu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rrelu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_rrelu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_selu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_selu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_selu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_selu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_complex_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_complex_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_silu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_smooth_l1_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_smooth_l1_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_smooth_l1_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_soft_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_soft_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_soft_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softmin_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softplus_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softplus_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softplus_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softplus_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_softsign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_tanhshrink_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_threshold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_bilinear_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_bilinear_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_bilinear_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nn_functional_upsample_nearest_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_nonzero_static_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_fro_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_inf_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_nuc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_nuc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_nuc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_norm_nuc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_in_place_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_number_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_number_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_number_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_normal_number_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ones_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ormqr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ormqr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ormqr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ormqr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_outer_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pca_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pca_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pca_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pca_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_permute_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pinverse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pinverse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pinverse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pinverse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polar_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polar_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_2_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_3_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_polygamma_polygamma_n_4_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_positive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_pow_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_put_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_qr_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_qr_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_qr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_qr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_quantile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_quantile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rad2deg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rand_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randint_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_randn_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_ravel_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_real_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reciprocal_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_remainder_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_renorm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_repeat_interleave_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_reshape_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resize_as__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_conj_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_resolve_neg_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_roll_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rot90_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_neg_3_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_neg_3_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_neg_3_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_round_decimals_neg_3_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_rsub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scalar_tensor_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_add_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amax_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_amin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_mean_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_prod_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_scatter_reduce_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_searchsorted_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_select_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sgn_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_short_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sigmoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sign_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_bartlett_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_bartlett_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_blackman_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_blackman_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_exponential_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_exponential_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_gaussian_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_gaussian_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_general_cosine_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_general_cosine_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_general_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_general_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_hamming_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_hamming_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_hann_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_hann_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_kaiser_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_kaiser_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_nuttall_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signal_windows_nuttall_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_signbit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sin_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sinh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_slice_scatter_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_softmax_with_dtype_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sort_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_mm_reduce_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_mm_reduce_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_mm_reduce_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_mm_reduce_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_sampled_addmm_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_sampled_addmm_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_sampled_addmm_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sparse_sampled_addmm_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_airy_ai_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_j1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_bessel_y1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_entr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_erfcx_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_h_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_hermite_polynomial_he_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i0e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_i1e_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_laguerre_polynomial_l_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_legendre_polynomial_p_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_log_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_i1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtr_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_ndtri_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_scaled_modified_bessel_k1_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_spherical_bessel_j0_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_xlog1py_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_special_zeta_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_list_args_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_split_with_sizes_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sqrt_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_square_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_squeeze_multiple_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_std_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stft_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stft_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stft_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_stft_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sub_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_sum_to_size_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_lowrank_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_lowrank_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_lowrank_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_svd_lowrank_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_t_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_along_dim_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_take_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tan_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tanh_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensor_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tensordot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tile_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_to_sparse_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_topk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch__scaled_mm_cuda_float8_e4m3fn, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trace_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_transpose_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapezoid_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trapz_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triangular_solve_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triangular_solve_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triangular_solve_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triangular_solve_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_tril_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_indices_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_triu_indices_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_true_divide_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_trunc_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unbind_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unflatten_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unfold_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_uniform_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_consecutive_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_uint16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_uint32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_uint64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unique_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unravel_index_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_chunk_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsafe_split_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_unsqueeze_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_mean_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_var_unbiased_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vdot_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_complex_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_complex_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_complex_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_real_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_as_real_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_copy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_view_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vsplit_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_vstack_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_where_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_xlogy_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zero__cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_bfloat16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_bool, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_complex128, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_complex32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_complex64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_float16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_float32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_float64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_int16, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_int32, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_int64, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_int8, test/test_meta.py::TestMetaCUDA::test_meta_outplace_zeros_like_cuda_uint8, test/test_meta.py::TestMetaCUDA::test_nan_to_num_cuda, test/test_meta.py::TestMetaCUDA::test_nonzero_cuda, test/test_meta.py::TestMetaCUDA::test_quantized_embedding_bag_cuda, test/test_meta.py::TestMetaCUDA::test_segment_reduce_backward_cuda, test/test_meta.py::TestMetaCUDA::test_stride_for_index_Tensor_cuda, test/test_meta.py::TestMetaCUDA::test_triangular_solve_out_cuda 2025-08-14T22:32:48.3320243Z 2025-08-14T22:32:50.1179664Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:32:50.1181034Z import pkg_resources 2025-08-14T22:32:50.4989925Z Running test_decomp 6/16 ... [2025-08-14 22:32:50.498443] 2025-08-14T22:32:50.4990449Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:32:50.4992238Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=6', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:32:50.498878] 2025-08-14T22:36:40.9924482Z 2025-08-14T22:36:40.9925585Z test_decomp 6/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_6.16_9cd5d861c9a0acea_.log 2025-08-14T22:36:41.0076044Z Running 543 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_and_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cond_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cond_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_power_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_qr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_solve_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_multinomial_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_glu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_kl_div_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_l1_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pca_lowrank_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_indices_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_and_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_logsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_sinc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_take_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_xlogy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float8_e4m3fnuz, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_frac_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_igamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_igamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_native_batch_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_glu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_grad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_renorm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_eval_mode_cuda_float32, test/test_decomp.py::DecompOneOffTestsCUDA::test_exponential_non_inf_cuda 2025-08-14T22:36:41.0219642Z 2025-08-14T22:36:42.7435767Z Uploading artifacts took 1.75 seconds 2025-08-14T22:36:45.0921707Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:36:45.0924470Z import pkg_resources 2025-08-14T22:36:45.4738342Z Running test_decomp 10/16 ... [2025-08-14 22:36:45.473361] 2025-08-14T22:36:45.4738712Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:36:45.4741532Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=10', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:36:45.473799] 2025-08-14T22:38:16.4868006Z 2025-08-14T22:38:16.4869194Z test_decomp 10/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_10.16_8e4ad3c724df4afa_.log 2025-08-14T22:38:16.5051078Z Running 570 items in this shard: test/test_decomp.py::TestDecompCUDA::test_bernoulli_default_cuda, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_solve_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_igammac_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_imag_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_istft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lgamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvals_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_factor_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_hermitian_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_slogdet_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_multinomial_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_dropout_backward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_ctc_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_huber_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_complex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_number_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_bartlett_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_hamming_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_real_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_masked_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_max_unpool2d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cumprod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_frexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_grid_sampler_2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_log_normal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_elu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_rrelu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_normal_number_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_polar_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_neg_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_var_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_complex32, test/test_decomp.py::DecompOneOffTestsCUDA::test_native_layer_norm_cpu_decomp_cuda 2025-08-14T22:38:16.5204297Z 2025-08-14T22:38:20.5625918Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:38:20.5627276Z import pkg_resources 2025-08-14T22:38:20.9453755Z Running test_decomp 14/16 ... [2025-08-14 22:38:20.944885] 2025-08-14T22:38:20.9454272Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:38:20.9457230Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=14', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:38:20.945328] 2025-08-14T22:41:21.6473806Z 2025-08-14T22:41:21.6475102Z test_decomp 1/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_1.16_5087982246a9bfca_.log 2025-08-14T22:41:21.6740264Z Running 579 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rand___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bincount_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_and_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_left_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_left_shift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_right_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frac_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lcm_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_factor_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_slogdet_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svdvals_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logdet_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mH_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_median_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_normalize_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_batch_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cross_entropy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_glu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_grid_sample_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_l1_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick__upsample_bilinear2d_aa_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_diag_embed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_logaddexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_native_dropout_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_hardshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_fro_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unfold_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unsqueeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_native_layer_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_norm_fro_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_fro_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_LSTM_train_mode_cuda_float64, test/test_decomp.py::DecompOneOffTestsCUDA::test_sdpa_nn_functional_scaled_dot_product_attention_cuda_bfloat16 2025-08-14T22:41:21.6961398Z 2025-08-14T22:41:25.8061866Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:41:25.8063298Z import pkg_resources 2025-08-14T22:41:26.1891583Z Running test_decomp 15/16 ... [2025-08-14 22:41:26.188668] 2025-08-14T22:41:26.1892106Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:41:26.1894928Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '-m', 'not serial', '--shard-id=15', '--num-shards=16', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:41:26.189106] 2025-08-14T22:41:29.6185545Z 2025-08-14T22:41:29.6186470Z test_decomp 14/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_14.16_9a19e0d9365fd668_.log 2025-08-14T22:41:29.6350256Z Running 598 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__batch_norm_with_update_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_lengths_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__segment_reduce_offsets_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addbmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bincount_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_copysign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exponential_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float8_e4m3fn, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hypot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hypot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lcm_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lcm_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_inv_ex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorinv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_and_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_layer_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nextafter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_layer_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_leaky_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_one_hot_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu6_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rrelu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_selu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_tanhshrink_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polar_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_like_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_gaussian_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_svd_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vdot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__batch_norm_with_update_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick__upsample_bilinear2d_aa_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_asinh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_right_shift_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_deg2rad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_nuc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_split_with_sizes_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_squeeze_multiple_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_std_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_eq_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_frexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_lcm_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log_normal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log_softmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_native_layer_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_leaky_relu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_grad_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_prelu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_prelu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_rrelu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_fro_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_uniform_cuda, test/test_decomp.py::HasDecompTest::test_has_decomposition 2025-08-14T22:41:29.6524681Z 2025-08-14T22:41:33.7171218Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:41:33.7172548Z import pkg_resources 2025-08-14T22:41:34.0985015Z Running test_binary_ufuncs 1/1 ... [2025-08-14 22:41:34.097921] 2025-08-14T22:41:34.0985550Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:41:34.0986745Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_binary_ufuncs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:41:34.098321] 2025-08-14T22:41:34.8700085Z 2025-08-14T22:41:34.8703845Z test_decomp 15/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_15.16_30431feb9eab4ba1_.log 2025-08-14T22:41:34.8844405Z Running 524 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__batch_norm_with_update_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addbmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdist_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cov_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_deg2rad_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_double_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_frac_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_select_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eig_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_grad_oriented_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_tensorsolve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_unpack_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logaddexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_multinomial_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nansum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_celu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv1d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_instance_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_kl_div_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_linear_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pdist_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rrelu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_selu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_complex_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_nuc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_qr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_renorm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_decimals_3_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_kaiser_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_kaiser_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_h_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__softmax_backward_data_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_alias_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_aminmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_right_shift_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_xor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_alias_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_std_mean_unbiased_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_expm1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_exponential_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_igammac_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_isneginf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_elu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_leaky_relu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mse_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_silu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softplus_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_norm_fro_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_normal_number_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_pow_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_randn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_GRU_eval_mode_cuda_float32, test/test_decomp.py::DecompOneOffTestsCUDA::test_rms_norm_decomp_cuda_cuda, test/test_decomp.py::HasDecompTest::test_aten_core_operators 2025-08-14T22:41:34.8982312Z 2025-08-14T22:41:38.9351214Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:41:38.9352866Z import pkg_resources 2025-08-14T22:41:39.3078724Z Running test_matmul_cuda 1/1 ... [2025-08-14 22:41:39.307369] 2025-08-14T22:41:39.3079269Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:41:39.3080934Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_matmul_cuda.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:41:39.307758] 2025-08-14T22:41:46.4863422Z 2025-08-14T22:41:46.4873080Z test_matmul_cuda 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_matmul_cuda_1.1_6f15f34674744d0e_.log 2025-08-14T22:41:46.5618519Z Running 1604 items in this shard: test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_addmm_baddmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_alignment_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_no_reduced_precision_small_size_4_size_32768_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_no_reduced_precision_small_size_4_size_32768_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_no_reduced_precision_small_size_8_size_32768_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_no_reduced_precision_small_size_8_size_32768_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_1000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_1000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_1000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_1000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_100_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_100_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_100_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_fp16_accumulate_size_100_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_10000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_10000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_10000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_10000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_1000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_1000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_1000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_1000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_100_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_100_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_100_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_reduced_precision_size_100_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublas_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_10000_backend_cublaslt_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublas_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_1000_backend_cublaslt_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublas_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublas_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublas_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublaslt_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublaslt_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_addmm_size_100_backend_cublaslt_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_and_lt_reduced_precision_fp16_accumulate_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_10000_10000_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_10000_10000_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_10000_10000_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_1000_10000_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_1000_10000_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_1_10000_1000_10000_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_1000_1000_1000_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_1000_1000_1000_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_1000_1000_1000_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_100_100_100_cuda_bfloat16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_100_100_100_cuda_float16, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_cublas_baddbmm_large_input_2_100_100_100_cuda_float32, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_fp16_accum_and_fp32_out_failure_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_fp16_accum_and_fp32_out_failure_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_fp16_accum_and_fp32_out_failure_batch_size_32_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_fp16_accum_and_fp32_out_failure_batch_size_32_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_False_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_False_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_True_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_False_a_row_major_True_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_False_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_False_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_True_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_2d_strided_True_a_row_major_True_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_False_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_False_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_True_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_False_a_row_major_True_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_False_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_False_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_True_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_2d_3d_strided_True_a_row_major_True_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_False_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_False_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_True_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_False_a_row_major_True_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_False_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_False_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_True_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_2d_strided_True_a_row_major_True_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_False_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_False_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_True_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_False_a_row_major_True_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_False_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_False_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_True_b_row_major_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_3d_3d_strided_True_a_row_major_True_b_row_major_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_False_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_False_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_False_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_False_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_True_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_True_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_True_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/2d_a_row_major_True_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_False_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_False_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_False_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_False_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_True_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_True_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_True_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_2d/3d_a_row_major_True_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_False_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_False_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_False_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_False_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_True_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_True_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_True_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/2d_a_row_major_True_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_False_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_False_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_False_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_False_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_True_b_row_major_False_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_True_b_row_major_False_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_True_b_row_major_True_max_autotune_False_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_grouped_gemm_compiled_op_3d/3d_a_row_major_True_b_row_major_True_max_autotune_True_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_bfloat16_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float16_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_1_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_32_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_1_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_32_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_1_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_32_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size0_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size0_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_16_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_16_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_1_backend_cublas_cuda, test/test_matmul_cuda.py::TestMatmulCudaCUDA::test_mm_bmm_dtype_overload_float32_M_64_N_64_K_64_batch_size_1_backend_cublaslt_cuda, test/test_matmul_cuda.py::TestMixedDtypesLinearCudaCUDA::test_mixed_dtypes_linear_cuda_bfloat16, test/test_matmul_cuda.py::TestMixedDtypesLinearCudaCUDA::test_mixed_dtypes_linear_cuda_float16, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_compile_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_error_messages_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_error_messages_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_False_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_eye_b_eye_fast_accum_True_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_False_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_fast_accum_True_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_False_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_ones_modified_fast_accum_True_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_False_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_b_scale_modified_fast_accum_True_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_False_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_ones_modified_b_ones_fast_accum_True_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_False_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_a_scale_modified_b_ones_fast_accum_True_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_False_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_from_data_fast_accum_True_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_False_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_1023_64_48_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_1023_64_48_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_1025_128_96_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_1025_128_96_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_127_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_127_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_128_128_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_128_128_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_128_256_512_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_128_256_512_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_197_224_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_197_224_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_197_240_272_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_197_240_272_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_256_256_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_256_256_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_256_512_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_256_512_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_2_1024_128_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_2_1024_128_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_31_1024_64_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_31_1024_64_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_45_96_1024_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_45_96_1024_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_512_128_256_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_512_128_256_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_65_96_112_recipe_mxfp8_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_mxfp8_nvfp4_numerics_test_case_name_data_random_scales_one_fast_accum_True_65_96_112_recipe_nvfp4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_blockwise_nvfp4_compile_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_error_message_fp8_pre_sm89_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float32_output_errors_with_bias_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float8_basics_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float8_bias_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float8_bias_relu_edgecase_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float8_error_messages_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float8_rowwise_scaling_sanity_use_fast_accum_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float8_rowwise_scaling_sanity_use_fast_accum_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float8_scale_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_float8_scale_fast_accum_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_honor_sm_carveout_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_non_divisible_leading_dim_bias_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_non_divisible_leading_dim_bias_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_pack_uint4_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_2d_2d_fast_accum_False_strided_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_2d_2d_fast_accum_False_strided_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_2d_2d_fast_accum_True_strided_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_2d_2d_fast_accum_True_strided_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_2d_3d_fast_accum_False_strided_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_2d_3d_fast_accum_False_strided_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_2d_3d_fast_accum_True_strided_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_2d_3d_fast_accum_True_strided_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_3d_2d_fast_accum_False_strided_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_3d_2d_fast_accum_False_strided_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_3d_2d_fast_accum_True_strided_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_3d_2d_fast_accum_True_strided_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_3d_3d_fast_accum_False_strided_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_3d_3d_fast_accum_False_strided_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_3d_3d_fast_accum_True_strided_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_grouped_gemm_3d_3d_fast_accum_True_strided_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_change_stride_bfloat16_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_change_stride_float16_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_change_stride_float32_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_bfloat16_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_block_wise_bfloat16_lhs_block_128_rhs_block_1_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_block_wise_bfloat16_lhs_block_1_rhs_block_128_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_block_wise_bfloat16_lhs_block_1_rhs_block_1_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_block_wise_float32_lhs_block_128_rhs_block_1_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_block_wise_float32_lhs_block_1_rhs_block_128_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_block_wise_float32_lhs_block_1_rhs_block_1_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_float16_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_float32_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_row_wise_bfloat16_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_scaled_mm_vs_emulated_row_wise_float32_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_zero_dim_tensorwise_which_dim_zero_0_use_torch_compile_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_zero_dim_tensorwise_which_dim_zero_0_use_torch_compile_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_zero_dim_tensorwise_which_dim_zero_1_use_torch_compile_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_zero_dim_tensorwise_which_dim_zero_1_use_torch_compile_True_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_zero_dim_tensorwise_which_dim_zero_2_use_torch_compile_False_cuda, test/test_matmul_cuda.py::TestFP8MatmulCUDA::test_zero_dim_tensorwise_which_dim_zero_2_use_torch_compile_True_cuda 2025-08-14T22:41:46.6331837Z 2025-08-14T22:41:50.5252945Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:41:50.5255373Z import pkg_resources 2025-08-14T22:41:50.9118994Z Running test_weak 1/1 ... [2025-08-14 22:41:50.911315] 2025-08-14T22:41:50.9119467Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:41:50.9121256Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_weak.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:41:50.911783] 2025-08-14T22:41:55.5854646Z 2025-08-14T22:41:55.5855558Z test_weak 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_weak_1.1_e2954cfebf7b943d_.log 2025-08-14T22:41:55.5865112Z Running 39 items in this shard: test/test_weak.py::WeakTest::test_make_weak_keyed_dict_from_dict, test/test_weak.py::WeakTest::test_make_weak_keyed_dict_from_weak_keyed_dict, test/test_weak.py::WeakTest::test_make_weak_keyed_dict_repr, test/test_weak.py::WeakTest::test_threaded_weak_key_dict_copy, test/test_weak.py::WeakTest::test_threaded_weak_key_dict_deepcopy, test/test_weak.py::WeakTest::test_weak_keyed_bad_delitem, test/test_weak.py::WeakTest::test_weak_keyed_delitem, test/test_weak.py::WeakTest::test_weak_keyed_dict_popitem, test/test_weak.py::WeakTest::test_weak_keyed_dict_setdefault, test/test_weak.py::WeakTest::test_weak_keyed_dict_update, test/test_weak.py::WeakTest::test_weak_keyed_union_operators, test/test_weak.py::WeakKeyDictionaryTestCase::test_bool, test/test_weak.py::WeakKeyDictionaryTestCase::test_constructor, test/test_weak.py::WeakKeyDictionaryTestCase::test_get, test/test_weak.py::WeakKeyDictionaryTestCase::test_getitem, test/test_weak.py::WeakKeyDictionaryTestCase::test_items, test/test_weak.py::WeakKeyDictionaryTestCase::test_keys, test/test_weak.py::WeakKeyDictionaryTestCase::test_len, test/test_weak.py::WeakKeyDictionaryTestCase::test_pop, test/test_weak.py::WeakKeyDictionaryTestCase::test_popitem, test/test_weak.py::WeakKeyDictionaryTestCase::test_read, test/test_weak.py::WeakKeyDictionaryTestCase::test_setdefault, test/test_weak.py::WeakKeyDictionaryTestCase::test_update, test/test_weak.py::WeakKeyDictionaryTestCase::test_values, test/test_weak.py::WeakKeyDictionaryTestCase::test_write, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_bool, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_constructor, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_get, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_getitem, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_items, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_keys, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_len, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_pop, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_popitem, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_read, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_setdefault, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_update, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_values, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_write 2025-08-14T22:41:55.5874301Z 2025-08-14T22:41:59.7170629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:41:59.7171966Z import pkg_resources 2025-08-14T22:42:00.0996635Z Running functorch/test_logging 1/1 ... [2025-08-14 22:42:00.099177] 2025-08-14T22:42:00.0997062Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:00.1000027Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_logging.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:00.099627] 2025-08-14T22:42:04.9229641Z 2025-08-14T22:42:04.9231268Z functorch/test_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_logging_1.1_8cc33e680b0486f7_.log 2025-08-14T22:42:04.9232833Z Running 1 items in this shard: test/functorch/test_logging.py::TestAOTLogging::test_logging 2025-08-14T22:42:04.9234015Z 2025-08-14T22:42:09.0673540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:42:09.0675747Z import pkg_resources 2025-08-14T22:42:09.4600951Z Running test_logging 1/1 ... [2025-08-14 22:42:09.459551] 2025-08-14T22:42:09.4601431Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:09.4603114Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_logging.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:09.459963] 2025-08-14T22:42:14.0842888Z 2025-08-14T22:42:14.0843860Z test_logging 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_logging_1.1_753319c24f227905_.log 2025-08-14T22:42:14.0844710Z Running 1 items in this shard: test/test_logging.py::LoggingTest::testApiUsage 2025-08-14T22:42:14.0845033Z 2025-08-14T22:42:18.2445228Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:42:18.2446560Z import pkg_resources 2025-08-14T22:42:18.6325980Z Running test_out_dtype_op 1/1 ... [2025-08-14 22:42:18.632053] 2025-08-14T22:42:18.6326585Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:18.6328761Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_out_dtype_op.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:18.632485] 2025-08-14T22:42:23.7073797Z 2025-08-14T22:42:23.7074769Z test_out_dtype_op 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_out_dtype_op_1.1_6ce46e0c26e8f090_.log 2025-08-14T22:42:23.7080434Z Running 12 items in this shard: test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_dynamo, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_inductor_decomp, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_inductor_decomp_trace, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_int_mm_default_trace, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_make_fx, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_mm_numerical, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_mul_scalar_numerical, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_no_autograd, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_non_functional, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_non_op_overload, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_op_functional, test/test_out_dtype_op.py::TestOutDtypeOp::test_out_dtype_wrong_output 2025-08-14T22:42:23.7083456Z 2025-08-14T22:42:27.8524638Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:42:27.8526808Z import pkg_resources 2025-08-14T22:42:28.2442747Z Running test_complex 1/1 ... [2025-08-14 22:42:28.243703] 2025-08-14T22:42:28.2443191Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:28.2445538Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_complex.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:28.244100] 2025-08-14T22:42:28.8738359Z 2025-08-14T22:42:28.8739364Z test_binary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_binary_ufuncs_1.1_446d3fa2c7f05fb6_.log 2025-08-14T22:42:29.3281111Z Running 12857 items in this shard: test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_add_broadcast_empty_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_add_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_add_with_tail_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_addcmul_scalars_as_floats_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_addsub_half_tensor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_atan2_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_atan2_edgecases_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_binary_op_mem_overlap_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_binary_op_scalar_device_unspecified_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_binary_ops_with_scalars_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bool_tensor_comparison_ops_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cdiv_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cmul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_div_underflow_overflow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_div_underflow_overflow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cpow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cpu_tensor_pow_cuda_scalar_tensor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cremainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cross_device_binary_ops_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cross_device_inplace_error_msg_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_csub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cuda_tensor_pow_scalar_tensor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cumulative_trapezoid_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_and_floordiv_script_vs_python_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_and_floordiv_vs_python_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_nonfinite_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_nonfinite_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_nonfinite_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_nonfinite_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divide_by_zero_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divide_by_zero_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divide_by_zero_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divide_by_zero_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divmul_scalar_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_exceptions_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_scalar_pow_float_tensor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_scalar_pow_float_tensor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_div_extremal_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_div_extremal_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_div_extremal_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_div_extremal_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_float_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_float_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_float_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_complex_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_complex_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_complex_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_complex_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cross_device_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_idiv_and_ifloordiv_vs_python_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_inplace_division_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_inplace_dunders_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_int_and_float_pow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_int_tensor_pow_neg_ints_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_ldexp_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_lowp_cpu_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_lowp_cpu_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_lowp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_lowp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_scalar_tensor_promotion_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_scalar_tensor_promotion_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_scalar_tensor_promotion_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_scalar_tensor_promotion_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_tensor_promotion_error_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_tensor_promotion_error_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_tensor_promotion_error_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_with_nontrivial_alignment_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_long_tensor_pow_floats_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_cross_device_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_nan_and_inf_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_nan_and_inf_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_nan_and_inf_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_nan_and_inf_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_forward_ad_float32_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_min_max_binary_op_nan_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_min_max_binary_op_nan_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_min_max_binary_op_nan_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_chalf_tensor_and_cpu_scalar_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_intertype_scalar_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_intertype_scalar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_intertype_scalar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_nextafter_bfloat16_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_out_resize_warning_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_complex_extremal_passing_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_complex_extremal_passing_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_inplace_resizing_exception_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_scalar_base_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_scalar_overloads_mem_overlap_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_scalar_type_promotion_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_remainder_fmod_large_dividend_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_remainder_fmod_large_dividend_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_remainder_overflow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rpow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_signed_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_signed_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_signed_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_signed_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_typing_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_tensor_pow_tensor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_trapezoid_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_true_divide_out_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_true_divide_out_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___radd___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rand___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rdiv___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rmod___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rmul___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___ror___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rpow___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rsub___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rxor___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs__conversions_complex_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs__conversions_polar_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_add_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_atan2_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_and_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_left_shift_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_or_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_right_shift_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_xor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_clamp_max_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_clamp_min_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_copysign_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_div_floor_rounding_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_div_no_rounding_mode_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_div_trunc_rounding_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_eq_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_float_power_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_floor_divide_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_fmax_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_fmin_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_fmod_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_gcd_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_ge_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_gt_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_heaviside_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_hypot_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_igamma_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_igammac_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_isclose_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_lcm_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_le_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_logaddexp_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_logical_and_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_logical_or_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_logical_xor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_lt_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_maximum_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_minimum_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_mul_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_ne_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_nextafter_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_pow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_remainder_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_rsub_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_special_xlog1py_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_special_zeta_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_sub_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_true_divide_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_xlogy_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_add_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_atan2_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_and_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_left_shift_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_or_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_right_shift_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_xor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_clamp_max_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_clamp_min_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_complex_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_copysign_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_div_floor_rounding_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_div_no_rounding_mode_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_div_trunc_rounding_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_eq_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_float_power_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_floor_divide_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_fmax_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_fmin_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_fmod_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_gcd_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_ge_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_gt_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_heaviside_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_hypot_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_igamma_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_igammac_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_isclose_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_jiterator_binary_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_jiterator_binary_return_by_ref_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_lcm_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_ldexp_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_le_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_logaddexp_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_logical_and_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_logical_or_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_logical_xor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_lt_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_max_binary_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_maximum_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_min_binary_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_minimum_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_mul_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_ne_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_nextafter_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_polar_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_pow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_remainder_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_rsub_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_chebyshev_polynomial_t_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_chebyshev_polynomial_u_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_chebyshev_polynomial_v_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_chebyshev_polynomial_w_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_hermite_polynomial_h_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_hermite_polynomial_he_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_laguerre_polynomial_l_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_legendre_polynomial_p_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_shifted_chebyshev_polynomial_t_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_shifted_chebyshev_polynomial_u_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_shifted_chebyshev_polynomial_v_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_shifted_chebyshev_polynomial_w_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_xlog1py_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_zeta_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_sub_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_true_divide_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_xlogy_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_bfloat16_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_gradients_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_scalar_type_promotion_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_uint8 2025-08-14T22:42:29.7606522Z 2025-08-14T22:42:32.9681653Z 2025-08-14T22:42:32.9682367Z test_complex 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_complex_1.1_db60d7a66c37f082_.log 2025-08-14T22:42:32.9686873Z Running 15 items in this shard: test/test_complex.py::TestComplexTensorCUDA::test_all_cuda_complex128, test/test_complex.py::TestComplexTensorCUDA::test_all_cuda_complex64, test/test_complex.py::TestComplexTensorCUDA::test_any_cuda_complex128, test/test_complex.py::TestComplexTensorCUDA::test_any_cuda_complex64, test/test_complex.py::TestComplexTensorCUDA::test_conj_copy_cuda_complex128, test/test_complex.py::TestComplexTensorCUDA::test_conj_copy_cuda_complex64, test/test_complex.py::TestComplexTensorCUDA::test_dtype_inference_cuda_float16, test/test_complex.py::TestComplexTensorCUDA::test_dtype_inference_cuda_float32, test/test_complex.py::TestComplexTensorCUDA::test_dtype_inference_cuda_float64, test/test_complex.py::TestComplexTensorCUDA::test_eq_cuda_complex128, test/test_complex.py::TestComplexTensorCUDA::test_eq_cuda_complex64, test/test_complex.py::TestComplexTensorCUDA::test_ne_cuda_complex128, test/test_complex.py::TestComplexTensorCUDA::test_ne_cuda_complex64, test/test_complex.py::TestComplexTensorCUDA::test_to_list_cuda_complex128, test/test_complex.py::TestComplexTensorCUDA::test_to_list_cuda_complex64 2025-08-14T22:42:32.9691113Z 2025-08-14T22:42:33.0220573Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:42:33.0221925Z import pkg_resources 2025-08-14T22:42:33.4123296Z Running cpp_extensions/python_agnostic_extension/test/test_python_agnostic 1/1 ... [2025-08-14 22:42:33.411822] 2025-08-14T22:42:33.4124011Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:33.4125780Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'cpp_extensions/python_agnostic_extension/test/test_python_agnostic.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:33.412207] 2025-08-14T22:42:37.1028481Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:42:37.1030704Z import pkg_resources 2025-08-14T22:42:37.5145577Z Running torch_np/numpy_tests/core/test_einsum 1/1 ... [2025-08-14 22:42:37.514058] 2025-08-14T22:42:37.5146235Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:37.5149770Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_einsum.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:37.514549] 2025-08-14T22:42:42.2395674Z 2025-08-14T22:42:42.2396872Z torch_np/numpy_tests/core/test_einsum 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_einsum_1.1_d527d7857f2c30ab_.log 2025-08-14T22:42:42.2412594Z Running 50 items in this shard: test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_broadcasting_dot_cases, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_collapse, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_combined_views_mapping, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_complex, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_B, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_D, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_F, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_b, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_d, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_e, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_f, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_h, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_i, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_different_paths_dtype_l, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_edge_cases, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_all_contig_non_contig_output, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_broadcast, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_errors, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_failed_on_p9_and_s390x, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_fixed_collapsingbug, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_fixedstridebug, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_misc, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_cfloat128, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_cfloat64, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_float16, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_float32, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_float64, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_int16, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_int32, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_int64, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_int8, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_sums_uint8, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_einsum_views, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_expand, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_hadamard_like_products, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_index_transformations, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_inner_product, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_out_is_res, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_output_order, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_random_cases, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_small_boolean_arrays, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsum::test_subscript_range, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsumPath::test_edge_paths, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsumPath::test_long_paths, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsumPath::test_memory_contraints, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsumPath::test_path_type_input, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsumPath::test_path_type_input_internal_trace, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsumPath::test_path_type_input_invalid, test/torch_np/numpy_tests/core/test_einsum.py::TestEinsumPath::test_spaces, test/torch_np/numpy_tests/core/test_einsum.py::TestMisc::test_overlap 2025-08-14T22:42:42.2427544Z 2025-08-14T22:42:44.8486822Z 2025-08-14T22:42:44.8488316Z cpp_extensions/python_agnostic_extension/test/test_python_agnostic 1/1 was successful, full logs can be found in artifacts with path test/test-reports/cpp_extensions.python_agnostic_extension.test.test_python_agnostic_1.1_58a80645f153ece8_.log 2025-08-14T22:42:44.8489863Z Running 1 items in this shard: test/cpp_extensions/python_agnostic_extension/test/test_python_agnostic.py::TestPythonAgnosticCUDA::test_extension_is_python_agnostic_cuda 2025-08-14T22:42:44.8490520Z 2025-08-14T22:42:46.3465489Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:42:46.3467290Z import pkg_resources 2025-08-14T22:42:46.7438254Z Running test_ops_jit 1/1 ... [2025-08-14 22:42:46.743292] 2025-08-14T22:42:46.7438626Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:46.7442382Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_jit.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:46.743689] 2025-08-14T22:42:48.9646274Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:42:48.9647812Z import pkg_resources 2025-08-14T22:42:49.3515436Z Running lazy/test_step_closures 1/1 ... [2025-08-14 22:42:49.351013] 2025-08-14T22:42:49.3515895Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:49.3517787Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_step_closures.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:49.351440] 2025-08-14T22:42:54.0258456Z 2025-08-14T22:42:54.0259278Z lazy/test_step_closures 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_step_closures_1.1_0800ded4aaf37df2_.log 2025-08-14T22:42:54.0261071Z Running 4 items in this shard: test/lazy/test_step_closures.py::ClosuresTest::test_asynchronous, test/lazy/test_step_closures.py::ClosuresTest::test_asynchronous_exception, test/lazy/test_step_closures.py::ClosuresTest::test_synchronous, test/lazy/test_step_closures.py::ClosuresTest::test_synchronous_exception 2025-08-14T22:42:54.0262640Z 2025-08-14T22:42:58.1143848Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:42:58.1145176Z import pkg_resources 2025-08-14T22:42:58.5019245Z Running test_ops_gradients 1/1 ... [2025-08-14 22:42:58.501418] 2025-08-14T22:42:58.5019673Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:42:58.5021919Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_gradients.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:42:58.501828] 2025-08-14T22:44:00.6092271Z 2025-08-14T22:44:00.6095962Z test_decomp 5/16 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_5.16_5ce3b51767e4ff8d_.log 2025-08-14T22:44:00.6257157Z Running 574 items in this shard: test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rdiv___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rsub___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__upsample_bilinear2d_aa_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addbmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bernoulli_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_inverse_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_max_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_complex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_trunc_rounding_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dot_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_equal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float8_e4m3fnuz, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftshift_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flip_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geometric_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_grid_sampler_2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_mean_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_inner_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_unary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cholesky_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvals_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvals_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_slogdet_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vecdot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_xor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_tensor_overload_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logsumexp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_median_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mode_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nan_to_num_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_dropout_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_full_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nextafter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_celu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_grid_sample_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardsigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_kl_div_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_leaky_relu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mish_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_mse_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multi_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rrelu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_rrelu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_bilinear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_inf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pca_lowrank_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_prod_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_quantile_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsqrt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_hann_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_mm_reduce_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_airy_ai_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_entr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_zeta_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_svd_lowrank_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensor_split_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tensordot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_complex_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_as_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick__softmax_backward_data_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addcdiv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_arange_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_as_strided_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_bernoulli_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_logaddexp2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_nn_functional_max_unpool3d_grad_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_special_xlog1py_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_squeeze_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_floor_rounding_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_trunc_rounding_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_dot_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_exp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_full_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_geometric_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_grid_sampler_2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_i0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_lcm_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_vector_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_log_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_mean_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_mv_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_full_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_elu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_glu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_grad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mish_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_mish_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_silu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_normal_in_place_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_permute_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_reciprocal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_round_decimals_neg_3_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sgn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_tril_indices_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_var_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_int8, test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_train_mode_cuda_float64, test/test_decomp.py::DecompOneOffTestsCUDA::test_contiguous_log_softmax_cuda 2025-08-14T22:44:00.6417665Z 2025-08-14T22:44:04.8303993Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:04.8305418Z import pkg_resources 2025-08-14T22:44:05.2149077Z Running test_fx_reinplace_pass 1/1 ... [2025-08-14 22:44:05.214305] 2025-08-14T22:44:05.2149648Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:05.2151044Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fx_reinplace_pass.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:05.214748] 2025-08-14T22:44:09.8393987Z 2025-08-14T22:44:09.8394859Z test_fx_reinplace_pass 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fx_reinplace_pass_1.1_69b70ee01497cd87_.log 2025-08-14T22:44:09.8399364Z Running 12 items in this shard: test/test_fx_reinplace_pass.py::TestReinplacePass::test_out_node_updated, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_basic, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_different_metadata, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_index_mutation, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_overlapping_memory, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_scatter_op, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_scatter_twice, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_scatter_twice_with_different_view_op_invalid, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_scatter_twice_with_different_view_op_invalid2, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_scatter_twice_with_different_view_op_valid, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_sym_input, test/test_fx_reinplace_pass.py::TestReinplacePass::test_reinplace_with_view 2025-08-14T22:44:09.8403448Z 2025-08-14T22:44:14.0362724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:14.0364059Z import pkg_resources 2025-08-14T22:44:14.4219829Z Running nn/test_multihead_attention 1/1 ... [2025-08-14 22:44:14.421367] 2025-08-14T22:44:14.4220269Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:14.4221296Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_multihead_attention.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:14.421787] 2025-08-14T22:44:16.8219216Z 2025-08-14T22:44:16.8220402Z test_ops_jit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_jit_1.1_7e33795140610277_.log 2025-08-14T22:44:16.8576364Z Running 1138 items in this shard: test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_abs_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_acos_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_acosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_asin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_asinh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_atan2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_atan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_atanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_clamp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_digamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_div_floor_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_div_no_rounding_mode_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_div_trunc_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_erf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_erfc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_erfinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_exp2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_expm1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_ge_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_gt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_i0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_igamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_igammac_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_le_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_lgamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_det_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_householder_product_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_inv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_matrix_power_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_log1p_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_log_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_log_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_logit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_logsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_lt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mH_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_matmul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_matrix_exp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_max_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_min_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_movedim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_ne_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_neg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_group_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_layer_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_rms_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_outer_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_decimals_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_decimals_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_decimals_neg_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sinc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sub_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_tanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_transpose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_trunc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_vstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_xlogy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_H_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_H_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_T_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_T_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___getitem___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___getitem___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___radd___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___radd___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rdiv___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rdiv___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmatmul___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmatmul___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmod___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmul___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmul___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rpow___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rpow___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rsub___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rsub___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__batch_norm_with_update_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__chunk_cat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__chunk_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__native_batch_norm_legit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__segment_reduce_lengths_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__segment_reduce_offsets_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__softmax_backward_data_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__upsample_bilinear2d_aa_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_abs_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_abs_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acos_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acos_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acosh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_add_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_add_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addbmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addbmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcdiv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcdiv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcmul_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcmul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_decomposed_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_decomposed_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_alias_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_alias_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_all_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_all_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_allclose_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_allclose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_aminmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_angle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_angle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_any_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_any_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_arange_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argsort_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argwhere_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argwhere_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_partial_views_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_partial_views_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asin_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asinh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asinh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atan2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atan_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atanh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_1d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_3d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_baddbmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_baddbmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bernoulli_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bfloat16_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bfloat16_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_block_diag_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_block_diag_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bool_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bool_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_shapes_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_tensors_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_to_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_to_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bucketize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_byte_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_byte_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cartesian_prod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cartesian_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cauchy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cdist_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cdouble_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cdouble_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ceil_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cfloat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cfloat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_chalf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_chalf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_char_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_char_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_inverse_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_inverse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_chunk_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_chunk_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clamp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clamp_max_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clamp_min_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clone_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clone_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_column_stack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_column_stack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_combinations_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_combinations_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_complex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_physical_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_physical_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_constant_pad_nd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_constant_pad_nd_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_contiguous_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_contiguous_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_copysign_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_corrcoef_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_corrcoef_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cos_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cos_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cosh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_count_nonzero_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_count_nonzero_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cov_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cov_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cross_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cross_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cummax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cummin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumprod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumprod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumsum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumsum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumulative_trapezoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumulative_trapezoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_deg2rad_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_embed_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_embed_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagflat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagflat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diff_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diff_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_digamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dist_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dist_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_floor_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_no_rounding_mode_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_no_rounding_mode_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_trunc_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_double_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_double_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dsplit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dstack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_einsum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_einsum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_permuted_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_permuted_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_strided_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_strided_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eq_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eq_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_equal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_equal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_erf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_erfc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_erfinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_as_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_as_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expm1_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expm1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exponential_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eye_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eye_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftshift_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftshift_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftshift_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftshift_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ihfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ihfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ihfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_rfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_rfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_rfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fill_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fill_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flatten_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flatten_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flip_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flip_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fliplr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fliplr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flipud_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flipud_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_power_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_power_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_floor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_floor_divide_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fmod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_frac_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_frexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gather_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gather_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ge_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_geometric_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_geqrf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_geqrf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gradient_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gradient_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_grid_sampler_2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_half_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_half_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hash_tensor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_heaviside_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_histc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hsplit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hstack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hypot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_i0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_igamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_igammac_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_imag_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_add_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_add_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_fill_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_fill_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_put_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_put_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_select_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_select_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_inner_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_inner_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_int_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_int_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isclose_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isclose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isfinite_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isfinite_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isinf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isinf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isnan_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isnan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isneginf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isposinf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isreal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isreal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_istft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_item_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_item_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_2inputs_2outputs_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_return_by_ref_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_unary_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_unary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_kron_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_kron_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_kthvalue_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ldexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ldexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_le_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lerp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lerp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lgamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cond_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cond_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cross_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cross_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_det_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_det_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_diagonal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_diagonal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eig_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eig_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvals_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvals_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvalsh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvalsh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_householder_product_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_householder_product_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_power_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_power_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_hermitian_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_multi_dot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_multi_dot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_hermitian_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_hermitian_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_singular_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_singular_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_qr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_qr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_slogdet_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_slogdet_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_triangular_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_triangular_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svd_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svdvals_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svdvals_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorinv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorsolve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorsolve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vander_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vander_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vecdot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vecdot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vector_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vector_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linspace_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linspace_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linspace_tensor_overload_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linspace_tensor_overload_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log10_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log10_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log1p_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log1p_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_normal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_softmax_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logaddexp2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logaddexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logcumsumexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logcumsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logdet_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logdet_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_and_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_and_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_not_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_not_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_or_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_or_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_xor_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_xor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_tensor_overload_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_tensor_overload_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logsumexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_long_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_long_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_unpack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_unpack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mH_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mH_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mT_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mT_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_argmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_argmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumprod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumprod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumsum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumsum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_fill_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_fill_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_log_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_logaddexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_logsumexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_logsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_median_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_normalize_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_normalize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_prod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_select_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_select_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_softmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_std_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_std_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_sum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_sum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_var_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_var_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_matmul_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_matmul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_matrix_exp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_matrix_exp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_pool2d_with_indices_backward_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_reduction_no_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_reduction_with_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_maximum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_median_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_list_of_tensors_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_list_of_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_variadic_tensors_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_variadic_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_min_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_min_reduction_no_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_min_reduction_with_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_minimum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mode_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_movedim_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_movedim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_msort_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mul_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_multinomial_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nan_to_num_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanmean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanmean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanmedian_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanquantile_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nansum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nansum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_native_batch_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_native_dropout_backward_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_native_layer_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ne_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ne_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_neg_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_neg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_strided_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_strided_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_full_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_full_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_ones_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_ones_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_zeros_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_zeros_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nextafter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_alpha_dropout_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_avg_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_avg_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_avg_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_batch_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_bilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_celu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_channel_shuffle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_channel_shuffle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv1d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv3d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose1d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_cosine_similarity_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_cross_entropy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_ctc_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_dropout2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_dropout3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_dropout_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_elu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_embedding_bag_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_embedding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_gelu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_glu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_grid_sample_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_group_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardshrink_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardsigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardswish_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardtanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_huber_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_instance_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_area_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_linear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_nearest_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_kl_div_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_l1_loss_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_l1_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_layer_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_leaky_relu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_linear_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_linear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_local_response_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_logsigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_mish_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_mse_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_multi_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_nll_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_normalize_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_normalize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_circular_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_circular_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_constant_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_constant_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_reflect_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_reflect_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pairwise_distance_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pairwise_distance_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pdist_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_shuffle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_prelu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_relu6_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_relu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_rms_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_rms_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_rrelu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_selu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_silu_complex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_silu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_soft_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softplus_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softshrink_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softsign_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softsign_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_tanhshrink_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_tanhshrink_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_threshold_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_unfold_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_unfold_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_upsample_bilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_upsample_nearest_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_static_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_static_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_fro_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_fro_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_inf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_inf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_nuc_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_nuc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_in_place_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_in_place_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_number_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ormqr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ormqr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_outer_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_outer_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pca_lowrank_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pca_lowrank_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pinverse_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pinverse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polar_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_4_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_positive_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_positive_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pow_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pow_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_prod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_put_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_put_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_qr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_qr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_quantile_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rad2deg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rand_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rand_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randint_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randint_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ravel_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ravel_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_real_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_real_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reciprocal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reciprocal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_remainder_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_renorm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_renorm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_repeat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_repeat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_repeat_interleave_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_repeat_interleave_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_as_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_as_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize__cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize_as__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize_as__cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_conj_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_conj_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_neg_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_neg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_roll_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_roll_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rot90_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rot90_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_decimals_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_decimals_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_decimals_neg_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsqrt_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsqrt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsub_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsub_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scalar_tensor_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scalar_tensor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_add_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_add_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_sum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_searchsorted_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_select_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_select_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_select_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sgn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sgn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_short_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_short_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sigmoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sign_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_bartlett_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_blackman_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_cosine_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_exponential_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_gaussian_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_general_cosine_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_general_hamming_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_hamming_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_hann_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_kaiser_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_nuttall_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signbit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sin_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinc_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_slice_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_slice_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_slice_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_softmax_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sort_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sparse_mm_reduce_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sparse_sampled_addmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sparse_sampled_addmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_airy_ai_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_j0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_j1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_y0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_y1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_t_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_u_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_v_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_w_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_entr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_erfcx_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_hermite_polynomial_h_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_hermite_polynomial_he_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_i0e_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_i1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_i1e_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_laguerre_polynomial_l_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_legendre_polynomial_p_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_log_ndtr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_i0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_i1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_k0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_k1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_ndtr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_ndtri_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_spherical_bessel_j0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_xlog1py_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_zeta_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_list_args_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_list_args_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sqrt_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sqrt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_square_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_square_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_multiple_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_multiple_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sub_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sub_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_to_size_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_to_size_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_lowrank_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_lowrank_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_along_dim_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_along_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tan_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tanh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensor_split_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensor_split_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensordot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensordot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tile_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tile_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_sparse_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_sparse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_topk_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trace_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trace_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapezoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapezoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapz_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapz_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triangular_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triangular_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tril_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tril_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_true_divide_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_true_divide_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trunc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unflatten_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unflatten_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_uniform_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_uniform_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unique_consecutive_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unique_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_chunk_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_chunk_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_split_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_split_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vdot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vdot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_as_complex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_as_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_as_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_as_real_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vsplit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vstack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_where_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_where_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_xlogy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zero__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zero__cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_like_cuda_float32 2025-08-14T22:44:16.8918469Z 2025-08-14T22:44:19.1463467Z 2025-08-14T22:44:19.1464506Z nn/test_multihead_attention 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_multihead_attention_1.1_5246fe959cdeb7ed_.log 2025-08-14T22:44:19.1474638Z Running 19 items in this shard: test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attention_average_attn_weights_False, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attention_average_attn_weights_True, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_3d_attn_mask, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_fast_path_invalid_shape, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_invalid_shape, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_nested_tensor_outside_fast_path, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_no_bias, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_batch_first_cuda_float16, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_batch_first_cuda_float32, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_batch_first_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_cuda_float16, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_cuda_float32, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attn_fast_path_query_and_bias_have_different_dtypes_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attn_fast_path_small_test_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attn_in_proj_bias_none_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attn_in_proj_weight_none_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_self_attn_two_masks_fast_path_cuda, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_self_attn_two_masks_fast_path_mock_cuda 2025-08-14T22:44:19.1483665Z 2025-08-14T22:44:21.0503389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:21.0504774Z import pkg_resources 2025-08-14T22:44:21.4341366Z Running functorch/test_aot_joint_with_descriptors 1/1 ... [2025-08-14 22:44:21.433605] 2025-08-14T22:44:21.4341969Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:21.4344855Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aot_joint_with_descriptors.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:21.434071] 2025-08-14T22:44:23.2407622Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:23.2408954Z import pkg_resources 2025-08-14T22:44:23.6158811Z Running test_pruning_op 1/1 ... [2025-08-14 22:44:23.615394] 2025-08-14T22:44:23.6159213Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:23.6161400Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_pruning_op.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:23.615790] 2025-08-14T22:44:26.4583878Z 2025-08-14T22:44:26.4585187Z functorch/test_aot_joint_with_descriptors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aot_joint_with_descriptors_1.1_4957fdb80114d9cd_.log 2025-08-14T22:44:26.4592859Z Running 10 items in this shard: test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_conv_bn_module, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_export_and_compile, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_fx_utils_conv_bn_module, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_fx_utils_multiple_outputs, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_fx_utils_node_consistency, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_fx_utils_simple_linear, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_in_out_specs, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_module_with_kwargs, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_multiple_outputs_module, test/functorch/test_aot_joint_with_descriptors.py::TestAOTJointWithDescriptors::test_simple_linear_module 2025-08-14T22:44:26.4599759Z 2025-08-14T22:44:28.2897172Z 2025-08-14T22:44:28.2898424Z test_pruning_op 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_pruning_op_1.1_cdbfd65a1ffeb368_.log 2025-08-14T22:44:28.2899581Z Running 2 items in this shard: test/test_pruning_op.py::PruningOpTest::test_rowwise_prune_op_32bit_indices, test/test_pruning_op.py::PruningOpTest::test_rowwise_prune_op_64bit_indices 2025-08-14T22:44:28.2900247Z 2025-08-14T22:44:30.6246313Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:30.6247825Z import pkg_resources 2025-08-14T22:44:31.0114016Z Running lazy/test_functionalization 1/1 ... [2025-08-14 22:44:31.010824] 2025-08-14T22:44:31.0114624Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:31.0117164Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_functionalization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:31.011278] 2025-08-14T22:44:32.4105158Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:32.4106479Z import pkg_resources 2025-08-14T22:44:32.8066130Z Running functorch/test_ac 1/1 ... [2025-08-14 22:44:32.806063] 2025-08-14T22:44:32.8066716Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:32.8067849Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_ac.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:32.806461] 2025-08-14T22:44:35.6351820Z 2025-08-14T22:44:35.6353205Z lazy/test_functionalization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_functionalization_1.1_04badc742a9adb55_.log 2025-08-14T22:44:35.6355617Z Running 2 items in this shard: test/lazy/test_functionalization.py::LazyFuncionalizationTest::test_data_assign, test/lazy/test_functionalization.py::LazyFuncionalizationTest::test_lazy_init_with_view 2025-08-14T22:44:35.6357018Z 2025-08-14T22:44:39.7304867Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:39.7306205Z import pkg_resources 2025-08-14T22:44:40.1093516Z Running test_ops_fwd_gradients 1/1 ... [2025-08-14 22:44:40.108818] 2025-08-14T22:44:40.1094186Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:40.1096190Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_fwd_gradients.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:40.109215] 2025-08-14T22:44:40.2362369Z 2025-08-14T22:44:40.2363378Z functorch/test_ac 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_ac_1.1_84e4bd3b5e506236_.log 2025-08-14T22:44:40.2367049Z Running 9 items in this shard: test/functorch/test_ac.py::MemoryBudgetTest::test_attention_vs_linear, test/functorch/test_ac.py::MemoryBudgetTest::test_custom_triton_kernel, test/functorch/test_ac.py::MemoryBudgetTest::test_manual_ac, test/functorch/test_ac.py::MemoryBudgetTest::test_matmul_even_chain, test/functorch/test_ac.py::MemoryBudgetTest::test_matmul_uneven_chain, test/functorch/test_ac.py::MemoryBudgetTest::test_prioritize_cheaper_matmul, test/functorch/test_ac.py::MemoryBudgetTest::test_prioritize_cheaper_matmul2, test/functorch/test_ac.py::MemoryBudgetTest::test_profile, test/functorch/test_ac.py::MemoryBudgetTest::test_rematerializes_cheap 2025-08-14T22:44:40.2369530Z 2025-08-14T22:44:44.3518833Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:44.3520989Z import pkg_resources 2025-08-14T22:44:44.7340153Z Running test_hub 1/1 ... [2025-08-14 22:44:44.733493] 2025-08-14T22:44:44.7340571Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:44.7342726Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_hub.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:44.733898] 2025-08-14T22:44:49.4577196Z 2025-08-14T22:44:49.4577970Z test_hub 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_hub_1.1_297d669373694752_.log 2025-08-14T22:44:49.4582296Z Running 20 items in this shard: test/test_hub.py::TestHub::test_download_url_to_file, test/test_hub.py::TestHub::test_get_set_dir, test/test_hub.py::TestHub::test_hub_parse_repo_info, test/test_hub.py::TestHub::test_list_entrypoints, test/test_hub.py::TestHub::test_load_commit_from_forked_repo, test/test_hub.py::TestHub::test_load_from_branch, test/test_hub.py::TestHub::test_load_from_github, test/test_hub.py::TestHub::test_load_from_local_dir, test/test_hub.py::TestHub::test_load_legacy_zip_checkpoint, test/test_hub.py::TestHub::test_load_state_dict_from_url, test/test_hub.py::TestHub::test_load_zip_1_6_checkpoint, test/test_hub.py::TestHub::test_trust_repo_builtin_trusted_owners, test/test_hub.py::TestHub::test_trust_repo_check_no, test/test_hub.py::TestHub::test_trust_repo_check_yes, test/test_hub.py::TestHub::test_trust_repo_false_emptystring, test/test_hub.py::TestHub::test_trust_repo_false_no, test/test_hub.py::TestHub::test_trust_repo_legacy, test/test_hub.py::TestHub::test_trust_repo_none, test/test_hub.py::TestHub::test_trust_repo_true, test/test_hub.py::TestHub::test_trusted_repo_false_yes 2025-08-14T22:44:49.4586422Z 2025-08-14T22:44:53.5847397Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:44:53.5848735Z import pkg_resources 2025-08-14T22:44:53.9823093Z Running test_set_default_mobile_cpu_allocator 1/1 ... [2025-08-14 22:44:53.981717] 2025-08-14T22:44:53.9823663Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:44:53.9826160Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_set_default_mobile_cpu_allocator.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:44:53.982115] 2025-08-14T22:44:58.6063089Z 2025-08-14T22:44:58.6063942Z test_set_default_mobile_cpu_allocator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_set_default_mobile_cpu_allocator_1.1_25b665f9c9fca39a_.log 2025-08-14T22:44:58.6065733Z Running 2 items in this shard: test/test_set_default_mobile_cpu_allocator.py::TestSetDefaultMobileCPUAllocator::test_exception, test/test_set_default_mobile_cpu_allocator.py::TestSetDefaultMobileCPUAllocator::test_no_exception 2025-08-14T22:44:58.6066774Z 2025-08-14T22:45:02.7645363Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:45:02.7646684Z import pkg_resources 2025-08-14T22:45:03.1449024Z Running test_cuda_expandable_segments 1/1 ... [2025-08-14 22:45:03.144346] 2025-08-14T22:45:03.1449621Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:45:03.1452294Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cuda_expandable_segments.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:45:03.144798] 2025-08-14T22:45:22.7471714Z 2025-08-14T22:45:22.7472741Z test_cuda_expandable_segments 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cuda_expandable_segments_1.1_225fc33edbc4dfb9_.log 2025-08-14T22:45:22.7516528Z Running 148 items in this shard: test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_additional_free_following_checkpoint, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_allocate_in_thread_to_pool, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_allocated_in_middle_of_segment, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_assigning_back_deleter_fns_to_tensor, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_check_pool_live_allocations, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_middle_allocations_contiguous, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_multiple_middle_allocations, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_no_triton_on_import, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_resnet, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_simple, test/test_cuda_expandable_segments.py::TestBlockStateAbsorption::test_tensor_dies_after_checkpoint, test/test_cuda_expandable_segments.py::TestCuda::test_arithmetic_large_tensor, test/test_cuda_expandable_segments.py::TestCuda::test_batch_norm_gather_stats, test/test_cuda_expandable_segments.py::TestCuda::test_bincount_ext, test/test_cuda_expandable_segments.py::TestCuda::test_caching_allocator_record_stream_oom, test/test_cuda_expandable_segments.py::TestCuda::test_caching_pinned_memory, test/test_cuda_expandable_segments.py::TestCuda::test_check_error, test/test_cuda_expandable_segments.py::TestCuda::test_copy_non_blocking, test/test_cuda_expandable_segments.py::TestCuda::test_copy_non_blocking_type_conversion, test/test_cuda_expandable_segments.py::TestCuda::test_cublas_allow_bf16_reduced_precision_reduction_get_set, test/test_cuda_expandable_segments.py::TestCuda::test_cublas_allow_fp16_accumulation_get_set, test/test_cuda_expandable_segments.py::TestCuda::test_cublas_allow_fp16_reduced_precision_reduction_get_set, test/test_cuda_expandable_segments.py::TestCuda::test_cublas_allow_tf32_get_set, test/test_cuda_expandable_segments.py::TestCuda::test_cublas_multiple_threads_same_device, test/test_cuda_expandable_segments.py::TestCuda::test_cublas_workspace_explicit_allocation, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_get_device_capability, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_get_device_name, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_get_device_properties, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_graph_allocator_propagates_stream, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_graph_error_options, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_graph_raw_graph, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_graph_raw_graph_keep_graph_false, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_graph_raw_graph_reset_and_recapture, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_graph_tensor_item_not_allowed, test/test_cuda_expandable_segments.py::TestCuda::test_cuda_memory_leak_detection_propagates_errors, test/test_cuda_expandable_segments.py::TestCuda::test_cudart_register, test/test_cuda_expandable_segments.py::TestCuda::test_cudnn_allow_tf32_get_set, test/test_cuda_expandable_segments.py::TestCuda::test_cudnn_multiple_threads_same_device, test/test_cuda_expandable_segments.py::TestCuda::test_cusparse_multiple_threads_same_device, test/test_cuda_expandable_segments.py::TestCuda::test_device_context_manager, test/test_cuda_expandable_segments.py::TestCuda::test_device_count_not_cached_pre_init, test/test_cuda_expandable_segments.py::TestCuda::test_events, test/test_cuda_expandable_segments.py::TestCuda::test_events_elapsedtime, test/test_cuda_expandable_segments.py::TestCuda::test_fixed_cuda_assert_async, test/test_cuda_expandable_segments.py::TestCuda::test_float32_matmul_precision_get_set, test/test_cuda_expandable_segments.py::TestCuda::test_fp32_precision_with_float32_matmul_precision, test/test_cuda_expandable_segments.py::TestCuda::test_fp32_precision_with_tf32, test/test_cuda_expandable_segments.py::TestCuda::test_gather_bool, test/test_cuda_expandable_segments.py::TestCuda::test_gds_fails_in_ci, test/test_cuda_expandable_segments.py::TestCuda::test_generic_stream_event, test/test_cuda_expandable_segments.py::TestCuda::test_get_device_index, test/test_cuda_expandable_segments.py::TestCuda::test_graph_capture_oom, test/test_cuda_expandable_segments.py::TestCuda::test_graph_capture_reset_recapture, test/test_cuda_expandable_segments.py::TestCuda::test_graph_capture_simple, test/test_cuda_expandable_segments.py::TestCuda::test_graph_concurrent_replay, test/test_cuda_expandable_segments.py::TestCuda::test_graph_debugdump, test/test_cuda_expandable_segments.py::TestCuda::test_graph_error, test/test_cuda_expandable_segments.py::TestCuda::test_graph_is_current_stream_capturing, test/test_cuda_expandable_segments.py::TestCuda::test_graph_make_graphed_callables_same_pool, test/test_cuda_expandable_segments.py::TestCuda::test_graph_memory_stats_and_use_result_after_destroy_graph, test/test_cuda_expandable_segments.py::TestCuda::test_graph_optims_with_explicitly_capturable_param_groups, test/test_cuda_expandable_segments.py::TestCuda::test_graph_record_stream, test/test_cuda_expandable_segments.py::TestCuda::test_graph_rng_distributions, test/test_cuda_expandable_segments.py::TestCuda::test_graph_rng_functional, test/test_cuda_expandable_segments.py::TestCuda::test_graph_three_successive, test/test_cuda_expandable_segments.py::TestCuda::test_graph_timing, test/test_cuda_expandable_segments.py::TestCuda::test_graph_two_successive, test/test_cuda_expandable_segments.py::TestCuda::test_graph_warn_if_has_zero_nodes, test/test_cuda_expandable_segments.py::TestCuda::test_graphsafe_set_get_rng_state, test/test_cuda_expandable_segments.py::TestCuda::test_hip_device_count, test/test_cuda_expandable_segments.py::TestCuda::test_hipblaslt_allow_tf32, test/test_cuda_expandable_segments.py::TestCuda::test_index_out_of_bounds_exception_cuda, test/test_cuda_expandable_segments.py::TestCuda::test_invalid_status_for_legacy_api, test/test_cuda_expandable_segments.py::TestCuda::test_is_pinned_no_context, test/test_cuda_expandable_segments.py::TestCuda::test_lazy_init, test/test_cuda_expandable_segments.py::TestCuda::test_manual_seed, test/test_cuda_expandable_segments.py::TestCuda::test_matmul_device_mismatch, test/test_cuda_expandable_segments.py::TestCuda::test_matmul_memory_use, test/test_cuda_expandable_segments.py::TestCuda::test_mean_fp16, test/test_cuda_expandable_segments.py::TestCuda::test_memory_allocation, test/test_cuda_expandable_segments.py::TestCuda::test_memory_stats, test/test_cuda_expandable_segments.py::TestCuda::test_memory_stats_of_multiple_generators_and_graphs, test/test_cuda_expandable_segments.py::TestCuda::test_min_max_inits, test/test_cuda_expandable_segments.py::TestCuda::test_multi_device_context_manager, test/test_cuda_expandable_segments.py::TestCuda::test_multi_device_stream_context_manager, test/test_cuda_expandable_segments.py::TestCuda::test_multinomial_ext, test/test_cuda_expandable_segments.py::TestCuda::test_multinomial_invalid_probs_cuda, test/test_cuda_expandable_segments.py::TestCuda::test_noncontiguous_pinned_memory, test/test_cuda_expandable_segments.py::TestCuda::test_norm_type_conversion, test/test_cuda_expandable_segments.py::TestCuda::test_nvtx, test/test_cuda_expandable_segments.py::TestCuda::test_out_of_memory, test/test_cuda_expandable_segments.py::TestCuda::test_pinned_memory_empty_cache, test/test_cuda_expandable_segments.py::TestCuda::test_pinned_memory_use_background_threads, test/test_cuda_expandable_segments.py::TestCuda::test_pinned_memory_with_cudaregister, test/test_cuda_expandable_segments.py::TestCuda::test_pinned_memory_with_cudaregister_multithread, test/test_cuda_expandable_segments.py::TestCuda::test_preferred_blas_library_settings, test/test_cuda_expandable_segments.py::TestCuda::test_prod_large, test/test_cuda_expandable_segments.py::TestCuda::test_randint_randomness_for_large_range, test/test_cuda_expandable_segments.py::TestCuda::test_random_no_reused_random_states_float32, test/test_cuda_expandable_segments.py::TestCuda::test_random_no_reused_random_states_float64, test/test_cuda_expandable_segments.py::TestCuda::test_record_stream, test/test_cuda_expandable_segments.py::TestCuda::test_record_stream_on_shifted_view, test/test_cuda_expandable_segments.py::TestCuda::test_reduction_gpu_memory_accessing, test/test_cuda_expandable_segments.py::TestCuda::test_rocm_backward_pass_guard, test/test_cuda_expandable_segments.py::TestCuda::test_serialization_array_with_empty, test/test_cuda_expandable_segments.py::TestCuda::test_serialization_array_with_storage, test/test_cuda_expandable_segments.py::TestCuda::test_specify_improper_device_name, test/test_cuda_expandable_segments.py::TestCuda::test_stream_compatibility, test/test_cuda_expandable_segments.py::TestCuda::test_stream_context_manager, test/test_cuda_expandable_segments.py::TestCuda::test_stream_event_repr, test/test_cuda_expandable_segments.py::TestCuda::test_streaming_backwards_callback, test/test_cuda_expandable_segments.py::TestCuda::test_streaming_backwards_multiple_streams, test/test_cuda_expandable_segments.py::TestCuda::test_streaming_backwards_sync, test/test_cuda_expandable_segments.py::TestCuda::test_streaming_backwards_sync_graph_root, test/test_cuda_expandable_segments.py::TestCuda::test_streams, test/test_cuda_expandable_segments.py::TestCuda::test_sum_fp16, test/test_cuda_expandable_segments.py::TestCuda::test_tiny_half_norm_, test/test_cuda_expandable_segments.py::TestCuda::test_to_cpu_blocking_by_default, test/test_cuda_expandable_segments.py::TestCuda::test_to_numpy, test/test_cuda_expandable_segments.py::TestCuda::test_torch_manual_seed_seeds_cuda_devices, test/test_cuda_expandable_segments.py::TestCuda::test_type_conversions, test/test_cuda_expandable_segments.py::TestCuda::test_uuid, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_allocator_fuzz, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_allocator_settings, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_cachingAllocator_raw_alloc, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_clock_speed, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_cpp_memory_snapshot_pickle, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_cycles, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_device_memory_used, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_direct_traceback, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_compile_regions, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_plots, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_plots_free_segment_stack, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_plots_free_stack, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_plots_history_context, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_profiler_viz, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_snapshot, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_snapshot_script, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_memory_snapshot_with_cpp, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_notifies_oom, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_nvml_get_handler, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_power_draw, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_raises_oom_max_split_size_mb_setting_False, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_raises_oom_max_split_size_mb_setting_True, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_raw_amdsmi_device_count, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_raw_amdsmi_device_uuids, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_temperature, test/test_cuda_expandable_segments.py::TestCudaMallocAsync::test_uuid_visible_devices 2025-08-14T22:45:22.7558372Z 2025-08-14T22:45:26.7928500Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:45:26.7929813Z import pkg_resources 2025-08-14T22:45:27.1715356Z Running test_stateless 1/1 ... [2025-08-14 22:45:27.170970] 2025-08-14T22:45:27.1715898Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:45:27.1717489Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_stateless.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:45:27.171348] 2025-08-14T22:45:31.9460437Z 2025-08-14T22:45:31.9461439Z test_stateless 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_stateless_1.1_631f24fc5337fddb_.log 2025-08-14T22:45:31.9491155Z Running 50 items in this shard: test/test_stateless.py::TestStatelessFunctionalAPI::test_circular_references_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_circular_references_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_batch_norm_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_batch_norm_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_member_reference_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_member_reference_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_multiple_dicts_error, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_tuple_dicts, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_data_parallel_error_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_data_parallel_error_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_data_parallel_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_data_parallel_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_gradient_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_gradient_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_jit_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_jit_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_kwargs_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_kwargs_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_in_place_operator_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_in_place_operator_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_module_fail_reset_to_original_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_module_fail_reset_to_original_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_some_weights_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_some_weights_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_special_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_special_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_strict_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_strict_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_some_weights_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_some_weights_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_weights_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_weights_strict_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_weights_strict_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_weights_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrized_module_change_parametrization_original_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrized_module_change_parametrization_original_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_setattr_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_setattr_strict_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_setattr_strict_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_setattr_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_errors_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_errors_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_no_error_without_flag, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_warns_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_warns_torch_func, test/test_stateless.py::TestStatelessDeprecation::test_private_stateless_warns, test/test_stateless.py::TestStatelessDeprecation::test_stateless_functional_call_warns, test/test_stateless.py::TestPythonOptimizeMode::test_runs_with_optimize_flag 2025-08-14T22:45:31.9520249Z 2025-08-14T22:45:36.0918916Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:45:36.0920373Z import pkg_resources 2025-08-14T22:45:36.4791626Z Running test_numpy_interop 1/1 ... [2025-08-14 22:45:36.478593] 2025-08-14T22:45:36.4792252Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:45:36.4794222Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_numpy_interop.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:45:36.478998] 2025-08-14T22:45:41.3532179Z 2025-08-14T22:45:41.3533203Z test_numpy_interop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_numpy_interop_1.1_274392e8b1c71869_.log 2025-08-14T22:45:41.3548021Z Running 44 items in this shard: test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_bool, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_complex128, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_complex64, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_float16, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_float32, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_float64, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_int16, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_int32, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_int64, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_int8, test/test_numpy_interop.py::TestNumPyInteropCUDA::test___eq___cuda_uint8, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_ctor_with_invalid_numpy_array_sequence_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_ctor_with_numpy_scalar_ctor_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_empty_tensors_interop_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_from_list_of_ndarray_warning_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_from_numpy_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_from_numpy_no_leak_on_invalid_dtype_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_from_numpy_zero_element_type_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_has_storage_numpy_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_multiplication_numpy_scalar_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_ndarray_astype_object_graph_break_2_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_ndarray_astype_object_graph_break_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_array_interface_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_index_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_index_multi_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_non_writeable_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_bfloat16, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_bool, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_complex128, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_complex64, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_float16, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_float32, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_float64, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_int16, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_int32, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_int64, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_int8, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_scalar_cmp_cuda_uint8, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_numpy_unresizable_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_parse_numpy_int_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_parse_numpy_int_overflow_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_to_numpy_bool_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_to_numpy_cuda, test/test_numpy_interop.py::TestNumPyInteropCUDA::test_to_numpy_force_argument_cuda 2025-08-14T22:45:41.3561965Z 2025-08-14T22:45:45.4950568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:45:45.4952055Z import pkg_resources 2025-08-14T22:45:45.8751933Z Running profiler/test_record_function 1/1 ... [2025-08-14 22:45:45.874657] 2025-08-14T22:45:45.8752579Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:45:45.8755039Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_record_function.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:45:45.875107] 2025-08-14T22:45:50.5495069Z 2025-08-14T22:45:50.5496408Z profiler/test_record_function 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_record_function_1.1_641f4896a74536a5_.log 2025-08-14T22:45:50.5499591Z Running 4 items in this shard: test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_delegation_with_profiler, test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_with_record_function, test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_with_record_function_fork, test/profiler/test_record_function.py::TestRecordFunction::test_record_function 2025-08-14T22:45:50.5501913Z 2025-08-14T22:45:54.6340576Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:45:54.6341999Z import pkg_resources 2025-08-14T22:45:55.0074815Z Running test_type_hints 1/1 ... [2025-08-14 22:45:55.007021] 2025-08-14T22:45:55.0075343Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:45:55.0077814Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_type_hints.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:45:55.007428] 2025-08-14T22:45:59.6310251Z 2025-08-14T22:45:59.6311313Z test_type_hints 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_type_hints_1.1_109e02351f85f1b1_.log 2025-08-14T22:45:59.6312444Z Running 1 items in this shard: test/test_type_hints.py::TestTypeHints::test_doc_examples 2025-08-14T22:45:59.6312933Z 2025-08-14T22:46:03.6489559Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:03.6490901Z import pkg_resources 2025-08-14T22:46:04.0226386Z Running nn/test_lazy_modules 1/1 ... [2025-08-14 22:46:04.022144] 2025-08-14T22:46:04.0226826Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:46:04.0228777Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_lazy_modules.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:46:04.022535] 2025-08-14T22:46:08.7464608Z 2025-08-14T22:46:08.7465548Z nn/test_lazy_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_lazy_modules_1.1_9e32cfe3322535bd_.log 2025-08-14T22:46:08.7481611Z Running 59 items in this shard: test/nn/test_lazy_modules.py::TestLazyModules::test_chained_initialization, test/nn/test_lazy_modules.py::TestLazyModules::test_invalid_functions, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm1d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm1d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm1d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm2d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm2d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm2d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm3d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm3d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm3d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_batchnorm_with_dict_input, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv1d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv1d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv1d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv2d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv2d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv2d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv3d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv3d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv3d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose1d_kwargs, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose1d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose1d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose2d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose2d_kwargs, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose2d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose2d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose3d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose3d_kwargs, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose3d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transpose3d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_conv_transposed1d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_forward_hook, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm1d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm1d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm1d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm2d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm2d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm2d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm3d, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm3d_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_instancenorm3d_state, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_linear_pickle, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_linear_state_and_forward, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_module_buffer, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_module_jit_buffer, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_module_jit_param, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_module_parameter, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_pre_forward_hook, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_share_memory_buffer, test/nn/test_lazy_modules.py::TestLazyModules::test_lazy_share_memory_param, test/nn/test_lazy_modules.py::TestLazyModules::test_linear, test/nn/test_lazy_modules.py::TestLazyModules::test_linear_state, test/nn/test_lazy_modules.py::TestLazyModules::test_materialize_device, test/nn/test_lazy_modules.py::TestLazyModules::test_materialize_dtype, test/nn/test_lazy_modules.py::TestLazyModules::test_optimizer_pass, test/nn/test_lazy_modules.py::TestLazyModules::test_spectral_norm, test/nn/test_lazy_modules.py::TestLazyModules::test_weight_norm 2025-08-14T22:46:08.7497030Z 2025-08-14T22:46:12.8008726Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:12.8010049Z import pkg_resources 2025-08-14T22:46:13.1756558Z Running test_segment_reductions 1/1 ... [2025-08-14 22:46:13.175149] 2025-08-14T22:46:13.1757086Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:46:13.1759418Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_segment_reductions.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:46:13.175528] 2025-08-14T22:46:17.8498008Z 2025-08-14T22:46:17.8498821Z test_segment_reductions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_segment_reductions_1.1_e225824b354b0ebd_.log 2025-08-14T22:46:17.8532971Z Running 74 items in this shard: test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_simple_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_simple_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_simple_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_simple_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_simple_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_simple_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_simple_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_multi_d_simple_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_max_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_max_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_max_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_max_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_max_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_max_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_max_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_max_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_mean_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_mean_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_mean_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_mean_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_mean_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_mean_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_mean_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_mean_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_min_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_min_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_min_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_min_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_min_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_min_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_min_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_min_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_prod_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_prod_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_prod_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_prod_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_prod_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_prod_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_prod_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_prod_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_sum_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_sum_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_sum_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_sum_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_sum_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_sum_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_sum_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_pytorch_scatter_test_cases_reduce_sum_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_1d_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_1d_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_1d_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_1d_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_1d_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_1d_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_1d_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_1d_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_zero_length_cuda_bfloat16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_zero_length_cuda_bfloat16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_zero_length_cuda_float16_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_zero_length_cuda_float16_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_zero_length_cuda_float32_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_zero_length_cuda_float32_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_zero_length_cuda_float64_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_simple_zero_length_cuda_float64_int64, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_unsafe_flag_cuda_int32, test/test_segment_reductions.py::TestSegmentReductionsCUDA::test_unsafe_flag_cuda_int64 2025-08-14T22:46:17.8563574Z 2025-08-14T22:46:21.9036077Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:21.9037487Z import pkg_resources 2025-08-14T22:46:22.2789576Z Running test_import_stats 1/1 ... [2025-08-14 22:46:22.278472] 2025-08-14T22:46:22.2790324Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:46:22.2792068Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_import_stats.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:46:22.278851] 2025-08-14T22:46:26.8523718Z 2025-08-14T22:46:26.8533582Z test_import_stats 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_import_stats_1.1_b5a9242fb437788c_.log 2025-08-14T22:46:26.8535594Z Running 2 items in this shard: test/test_import_stats.py::TestImportTime::test_time_cuda_device_count, test/test_import_stats.py::TestImportTime::test_time_import_torch 2025-08-14T22:46:26.8536255Z 2025-08-14T22:46:30.9000548Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:30.9001929Z import pkg_resources 2025-08-14T22:46:31.2745713Z Running test_masked 1/1 ... [2025-08-14 22:46:31.274063] 2025-08-14T22:46:31.2746226Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:46:31.2748076Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_masked.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:46:31.274451] 2025-08-14T22:46:37.2004222Z 2025-08-14T22:46:37.2005264Z test_masked 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_masked_1.1_710b562ab5663766_.log 2025-08-14T22:46:37.2062681Z Running 194 items in this shard: test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amax_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_amin_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_mean_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_mean_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_mean_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_mean_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_mean_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_mean_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_bool, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_prod_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_bool, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_coo_masked_sum_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amax_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_amin_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_mean_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_mean_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_mean_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_mean_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_mean_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_mean_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_bool, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_prod_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_bool, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_sparse_csr_masked_sum_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amax_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_amin_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_mean_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_mean_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_mean_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_mean_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_mean_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_mean_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_bool, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_prod_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_bool, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_mask_layout_strided_masked_sum_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_log_softmax_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_log_softmax_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_log_softmax_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_log_softmax_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_norm_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_norm_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_norm_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_norm_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_normalize_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_normalize_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_normalize_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_normalize_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_normalize_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_normalize_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_softmax_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_softmax_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_softmax_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_softmax_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_softmin_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_softmin_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_softmin_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_softmin_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_std_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_bfloat16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_complex128, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_complex64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_float16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_float32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_float64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_int16, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_int32, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_int64, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_int8, test/test_masked.py::TestMaskedCUDA::test_reference_masked_masked_var_cuda_uint8, test/test_masked.py::TestMaskedCUDA::test_where_coo_fill_value_0_cuda, test/test_masked.py::TestMaskedCUDA::test_where_coo_fill_value_123_cuda, test/test_masked.py::TestMaskedCUDA::test_where_csr_fill_value_0_cuda, test/test_masked.py::TestMaskedCUDA::test_where_csr_fill_value_123_cuda, test/test_masked.py::TestMaskedCUDA::test_where_hybrid_coo_fill_value_0_cuda, test/test_masked.py::TestMaskedCUDA::test_where_hybrid_coo_fill_value_123_cuda 2025-08-14T22:46:37.2118526Z 2025-08-14T22:46:41.2329828Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:41.2331448Z import pkg_resources 2025-08-14T22:46:41.6067714Z Running test_cuda_sanitizer 1/1 ... [2025-08-14 22:46:41.606292] 2025-08-14T22:46:41.6068180Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:46:41.6071061Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cuda_sanitizer.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:46:41.606671] 2025-08-14T22:46:44.9092064Z 2025-08-14T22:46:44.9093123Z test_ops_fwd_gradients 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_fwd_gradients_1.1_6bdac248fc67747e_.log 2025-08-14T22:46:45.0337728Z Running 3192 items in this shard: test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_H_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_H_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_T_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_T_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___getitem___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___getitem___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___radd___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___radd___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rdiv___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rdiv___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rmatmul___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rmatmul___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rmod___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rmul___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rmul___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rpow___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rpow___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rsub___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad___rsub___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__batch_norm_with_update_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__chunk_cat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__chunk_cat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__native_batch_norm_legit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__segment_reduce_lengths_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__segment_reduce_offsets_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__softmax_backward_data_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__unsafe_masked_index_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__unsafe_masked_index_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_abs_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_abs_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_acos_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_acos_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_acosh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_acosh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addbmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addbmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addcdiv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addcdiv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addcmul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addcmul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addmm_decomposed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addmm_decomposed_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addmv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addmv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_addr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_alias_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_alias_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_all_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_all_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_allclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_allclose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_aminmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_angle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_angle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_any_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_any_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_arange_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_argmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_argsort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_argwhere_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_argwhere_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_as_strided_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_as_strided_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_as_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_as_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_as_strided_partial_views_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_as_strided_partial_views_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_as_strided_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_as_strided_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_asin_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_asin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_asinh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_asinh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atan2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atanh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atleast_1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atleast_1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atleast_2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atleast_2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atleast_3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_atleast_3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_baddbmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_baddbmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bernoulli_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bfloat16_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bfloat16_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_block_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_block_diag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bool_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bool_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_broadcast_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_broadcast_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_broadcast_to_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_broadcast_to_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_bucketize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_byte_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_byte_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cartesian_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cartesian_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cauchy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cdist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cdouble_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cdouble_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ceil_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cfloat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cfloat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_chalf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_chalf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_char_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_char_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cholesky_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cholesky_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cholesky_inverse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cholesky_inverse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cholesky_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cholesky_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_chunk_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_chunk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_clamp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_clamp_max_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_clamp_min_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_clone_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_clone_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_column_stack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_column_stack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_combinations_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_combinations_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_complex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_conj_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_conj_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_conj_physical_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_conj_physical_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_constant_pad_nd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_constant_pad_nd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_contiguous_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_contiguous_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_copysign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_corrcoef_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_corrcoef_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cos_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cos_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cosh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cosh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_count_nonzero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_count_nonzero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cov_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cov_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cross_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cross_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cummax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cummin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cumprod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cumprod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cumsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cumsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cumulative_trapezoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_cumulative_trapezoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_deg2rad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diag_embed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diag_embed_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagflat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagflat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagonal_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagonal_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagonal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagonal_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diagonal_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diff_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_diff_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_digamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dist_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_div_floor_rounding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_div_no_rounding_mode_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_div_no_rounding_mode_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_div_trunc_rounding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_double_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_double_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_dstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_einsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_einsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_permuted_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_permuted_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_empty_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_eq_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_eq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_equal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_equal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_erf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_erfc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_erfinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_exp2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_exp2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_exp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_exp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expand_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expand_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expand_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expand_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expand_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expand_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expm1_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_expm1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_exponential_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_eye_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_eye_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_fftshift_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_hfft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_hfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_hfft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_hfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_hfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_hfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ifftshift_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ihfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ihfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_ihfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_irfft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_irfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_irfft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_irfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_irfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_irfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_rfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_rfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fft_rfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_flatten_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_flatten_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_flip_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_flip_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fliplr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fliplr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_flipud_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_flipud_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_float_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_float_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_float_power_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_float_power_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_floor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_floor_divide_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_fmod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_frac_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_frexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_full_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_full_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_full_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_full_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_gather_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_gather_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ge_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_geometric_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_geqrf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_geqrf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_gradient_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_gradient_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_grid_sampler_2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_gt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_half_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_half_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_hash_tensor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_heaviside_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_histc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_hsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_hsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_hstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_hstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_hypot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_i0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_igamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_igammac_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_imag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_put_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_put_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_reduce_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_reduce_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_reduce_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_index_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_inner_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_inner_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_int_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_int_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isclose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isfinite_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isfinite_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isinf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isinf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isnan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isnan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isneginf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isposinf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isreal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_isreal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_istft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_item_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_item_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_binary_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_unary_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_jiterator_unary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_kron_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_kron_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_kthvalue_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ldexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ldexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_le_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lerp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lerp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lgamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cholesky_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cholesky_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cholesky_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cholesky_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cond_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cond_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cross_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_cross_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_det_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_det_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_diagonal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eig_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eig_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eigh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eigh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eigvals_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eigvals_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_eigvalsh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_householder_product_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_householder_product_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_inv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_inv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_inv_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_inv_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_ldl_factor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_ldl_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_ldl_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lstsq_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lstsq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lu_factor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lu_factor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lu_factor_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lu_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_lu_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_power_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_power_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_rank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_multi_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_multi_dot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_pinv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_pinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_pinv_singular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_pinv_singular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_qr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_qr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_slogdet_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_slogdet_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_solve_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_solve_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_solve_triangular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_svd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_svd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_svdvals_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_svdvals_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_tensorinv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_tensorinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_tensorsolve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_vander_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_vander_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_vecdot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_vecdot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_vector_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linalg_vector_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linspace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linspace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_linspace_tensor_overload_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log10_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log10_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log1p_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log1p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log_normal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log_softmax_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_log_softmax_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logaddexp2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logaddexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logcumsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logcumsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logdet_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logdet_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_and_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_and_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_not_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_not_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_or_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_or_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_xor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logical_xor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logspace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logspace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logspace_tensor_overload_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logspace_tensor_overload_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_logsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_long_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_long_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lu_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lu_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lu_unpack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_lu_unpack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mH_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mH_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mT_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mT_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_argmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_cumprod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_cumprod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_cumsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_cumsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_log_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_logaddexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_logsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_logsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_median_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_normalize_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_normalize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_softmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_std_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_std_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_sum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_var_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_masked_var_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_matmul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_matmul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_matrix_exp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_matrix_exp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_max_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_max_reduction_no_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_max_reduction_with_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_maximum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_median_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_min_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_min_reduction_no_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_min_reduction_with_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_minimum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mode_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_movedim_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_movedim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_msort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_multinomial_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nan_to_num_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nanmean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nanmean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nanmedian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nanquantile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nansum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nansum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_narrow_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_narrow_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_narrow_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_narrow_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_native_batch_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_native_dropout_backward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_native_layer_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ne_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ne_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_neg_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_empty_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_empty_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_empty_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_empty_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_full_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_full_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_ones_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_zeros_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_new_zeros_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nextafter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_batch_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_celu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_cross_entropy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_ctc_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_dropout3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_dropout_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_elu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_embedding_bag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_embedding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_gelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_glu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_group_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_hardshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_hardswish_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_hardtanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_huber_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_instance_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_area_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_kl_div_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_layer_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_leaky_relu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_linear_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_local_response_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_logsigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_mish_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_mse_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_normalize_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_normalize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_circular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_constant_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_reflect_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_replicate_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pdist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_prelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_relu6_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_relu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_rms_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_rms_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_rrelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_selu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_silu_complex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_silu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softplus_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softsign_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_softsign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_threshold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_unfold_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_unfold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nonzero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nonzero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nonzero_static_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_nonzero_static_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_norm_fro_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_norm_fro_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_norm_inf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_norm_inf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_norm_nuc_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_norm_nuc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_normal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_normal_in_place_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_normal_in_place_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_normal_number_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ones_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ones_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ones_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ormqr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ormqr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_outer_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_outer_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_pca_lowrank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_pca_lowrank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_permute_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_permute_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_permute_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_permute_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_pinverse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_pinverse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_polar_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_polygamma_polygamma_n_4_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_positive_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_positive_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_pow_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_pow_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_put_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_put_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_qr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_qr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_quantile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rad2deg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rand_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rand_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_randint_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_randint_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_randn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_randn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_randn_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_randn_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ravel_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_ravel_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_real_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_real_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_reciprocal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_reciprocal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_remainder_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_renorm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_renorm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_repeat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_repeat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_repeat_interleave_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_repeat_interleave_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_reshape_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_reshape_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_reshape_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_reshape_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resize__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resize__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resize_as__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resize_as__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resolve_conj_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resolve_conj_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resolve_neg_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_resolve_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_roll_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_roll_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rot90_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rot90_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_round_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_round_decimals_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_round_decimals_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_round_decimals_neg_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rsqrt_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rsqrt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rsub_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_rsub_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scalar_tensor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scalar_tensor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_reduce_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_reduce_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_reduce_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_scatter_reduce_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_searchsorted_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_select_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sgn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sgn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_short_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_short_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sigmoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_bartlett_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_blackman_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_cosine_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_exponential_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_gaussian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_general_cosine_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_general_hamming_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_hamming_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_hann_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_kaiser_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signal_windows_nuttall_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_signbit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sin_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sinc_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sinc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sinh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sinh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_slice_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_slice_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_slice_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_softmax_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_softmax_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sparse_mm_reduce_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sparse_sampled_addmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sparse_sampled_addmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_airy_ai_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_bessel_j0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_bessel_j1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_bessel_y0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_bessel_y1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_entr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_erfcx_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_hermite_polynomial_h_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_hermite_polynomial_he_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_i0e_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_i1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_i1e_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_laguerre_polynomial_l_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_legendre_polynomial_p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_log_ndtr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_modified_bessel_i0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_modified_bessel_i1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_modified_bessel_k0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_ndtr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_ndtri_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_xlog1py_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_special_zeta_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_list_args_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_list_args_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_with_sizes_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_with_sizes_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_with_sizes_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_split_with_sizes_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sqrt_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sqrt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_square_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_square_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_squeeze_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_squeeze_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_squeeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_squeeze_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_squeeze_multiple_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_squeeze_multiple_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_stack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_stack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_mean_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_std_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_stft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_stft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sub_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sub_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sum_to_size_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_sum_to_size_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_svd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_svd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_svd_lowrank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_svd_lowrank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_t_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_t_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_t_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_take_along_dim_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_take_along_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_take_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_take_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tanh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tensor_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tensor_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tensordot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tensordot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tile_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_to_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_to_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_to_sparse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_to_sparse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_topk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_transpose_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_transpose_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_transpose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_transpose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trapezoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trapezoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trapz_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trapz_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_triangular_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_triangular_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tril_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_tril_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_triu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_triu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_true_divide_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_true_divide_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_trunc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unbind_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unbind_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unbind_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unbind_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unflatten_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unflatten_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unfold_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unfold_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unfold_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unfold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_uniform_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_uniform_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unique_consecutive_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unique_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsafe_chunk_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsafe_chunk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsafe_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsafe_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsqueeze_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsqueeze_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsqueeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_unsqueeze_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_var_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_var_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_var_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_var_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_var_mean_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_var_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_var_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_var_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_vdot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_vdot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_as_complex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_as_real_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_view_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_vsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_vsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_vstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_vstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_where_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_where_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_xlogy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_zero__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_zero__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_zeros_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_zeros_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_zeros_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_fn_fwgrad_bwgrad_zeros_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_H_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_H_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_T_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_T_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___getitem___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___getitem___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___radd___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___radd___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rdiv___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rdiv___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rmatmul___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rmatmul___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rmod___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rmul___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rmul___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rpow___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rpow___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rsub___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD___rsub___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__batch_norm_with_update_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__chunk_cat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__chunk_cat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__native_batch_norm_legit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__segment_reduce_lengths_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__segment_reduce_offsets_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__softmax_backward_data_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__unsafe_masked_index_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__unsafe_masked_index_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD__upsample_bilinear2d_aa_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_abs_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_abs_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_acos_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_acos_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_acosh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_acosh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addbmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addbmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addcdiv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addcdiv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addcmul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addcmul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addmm_decomposed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addmm_decomposed_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addmv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addmv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_addr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_alias_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_alias_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_all_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_all_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_allclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_allclose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_aminmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_angle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_angle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_any_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_any_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_arange_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argsort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argwhere_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_argwhere_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_partial_views_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_partial_views_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_as_strided_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_asin_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_asin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_asinh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_asinh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atan2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atanh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atleast_1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atleast_1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atleast_2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atleast_2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atleast_3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_atleast_3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_baddbmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_baddbmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bernoulli_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bfloat16_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bfloat16_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_block_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_block_diag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bool_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bool_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_broadcast_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_broadcast_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_broadcast_to_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_broadcast_to_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_bucketize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_byte_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_byte_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cartesian_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cartesian_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cauchy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cdist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cdouble_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cdouble_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ceil_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cfloat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cfloat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_chalf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_chalf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_char_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_char_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cholesky_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cholesky_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cholesky_inverse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cholesky_inverse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cholesky_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cholesky_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_chunk_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_chunk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_clamp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_clamp_max_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_clamp_min_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_clone_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_clone_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_column_stack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_column_stack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_combinations_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_combinations_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_complex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_conj_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_conj_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_conj_physical_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_conj_physical_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_constant_pad_nd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_constant_pad_nd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_contiguous_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_contiguous_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_copysign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_corrcoef_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_corrcoef_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cos_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cos_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cosh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cosh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_count_nonzero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_count_nonzero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cov_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cov_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cross_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cross_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cummax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cummin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cumprod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cumprod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cumsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cumsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cumulative_trapezoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_cumulative_trapezoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_deg2rad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diag_embed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diag_embed_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagflat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagflat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagonal_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagonal_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagonal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagonal_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diagonal_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diff_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_diff_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_digamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_dist_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_dist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_div_floor_rounding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_div_no_rounding_mode_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_div_no_rounding_mode_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_div_trunc_rounding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_dot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_double_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_double_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_dsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_dsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_dstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_dstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_einsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_einsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_empty_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_empty_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_empty_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_empty_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_empty_permuted_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_empty_permuted_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_empty_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_empty_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_eq_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_eq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_equal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_equal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_erf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_erfc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_erfinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_exp2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_exp2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_exp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_exp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_expand_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_expand_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_expand_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_expand_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_expand_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_expand_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_expm1_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_expm1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_exponential_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_eye_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_eye_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_fftshift_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_hfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ifftshift_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ihfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ihfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_ihfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_irfft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_irfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_irfft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_irfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_irfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_irfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_rfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_rfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fft_rfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_flatten_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_flatten_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_flip_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_flip_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fliplr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fliplr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_flipud_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_flipud_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_float_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_float_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_float_power_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_float_power_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_floor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_floor_divide_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_fmod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_frac_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_frexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_full_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_full_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_full_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_full_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_gather_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_gather_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ge_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_geometric_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_geqrf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_geqrf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_gradient_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_gradient_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_grid_sampler_2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_gt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_half_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_half_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_hash_tensor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_heaviside_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_histc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_hsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_hsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_hstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_hstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_hypot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_i0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_igamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_igammac_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_imag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_put_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_put_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_reduce_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_reduce_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_reduce_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_index_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_inner_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_inner_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_int_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_int_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isclose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isfinite_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isfinite_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isinf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isinf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isnan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isnan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isneginf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isposinf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isreal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_isreal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_istft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_item_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_item_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_binary_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_unary_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_jiterator_unary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_kron_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_kron_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_kthvalue_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ldexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ldexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_le_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lerp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lerp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lgamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cholesky_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cholesky_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cholesky_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cholesky_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cond_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cond_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cross_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_cross_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_det_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_det_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_diagonal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eig_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eig_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eigh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eigh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eigvals_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eigvals_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eigvalsh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_eigvalsh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_householder_product_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_householder_product_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_inv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_inv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_inv_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_inv_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_ldl_factor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_ldl_factor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_ldl_factor_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_ldl_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_ldl_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lstsq_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lstsq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_factor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_factor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_factor_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_factor_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_lu_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_power_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_power_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_rank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_rank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_multi_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_multi_dot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_pinv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_pinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_pinv_hermitian_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_pinv_hermitian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_pinv_singular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_pinv_singular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_qr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_qr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_slogdet_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_slogdet_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_solve_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_solve_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_solve_triangular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_solve_triangular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_svd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_svd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_svdvals_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_svdvals_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_tensorinv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_tensorinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_tensorsolve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_tensorsolve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_vander_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_vander_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_vecdot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_vecdot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_vector_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_vector_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linspace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linspace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linspace_tensor_overload_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linspace_tensor_overload_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log10_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log10_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log1p_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log1p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log_normal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log_softmax_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_log_softmax_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logaddexp2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logaddexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logcumsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logcumsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logdet_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logdet_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_and_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_and_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_not_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_not_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_or_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_or_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_xor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logical_xor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logspace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logspace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logspace_tensor_overload_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logspace_tensor_overload_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_logsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_long_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_long_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lu_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lu_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lu_unpack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_lu_unpack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mH_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mH_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mT_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mT_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_argmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_cumprod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_cumprod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_cumsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_cumsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_log_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_logaddexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_logsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_logsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_median_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_normalize_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_normalize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_softmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_std_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_std_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_sum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_var_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_masked_var_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_matmul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_matmul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_matrix_exp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_matrix_exp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_max_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_max_reduction_no_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_max_reduction_with_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_maximum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_median_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_meshgrid_list_of_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_meshgrid_variadic_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_min_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_min_reduction_no_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_min_reduction_with_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_minimum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mode_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_movedim_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_movedim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_msort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_multinomial_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nan_to_num_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nanmean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nanmean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nanmedian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nanquantile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nansum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nansum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_narrow_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_narrow_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_narrow_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_narrow_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_native_batch_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_native_dropout_backward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_native_layer_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ne_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ne_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_neg_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_empty_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_empty_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_empty_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_empty_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_full_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_full_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_ones_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_zeros_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_new_zeros_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nextafter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_alpha_dropout_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_avg_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_avg_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_avg_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_batch_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_celu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_channel_shuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_cosine_similarity_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_cross_entropy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_ctc_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_dropout2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_dropout3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_dropout_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_elu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_embedding_bag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_embedding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_gelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_glu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_grid_sample_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_group_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_hardshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_hardsigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_hardswish_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_hardtanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_huber_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_instance_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_area_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_kl_div_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_l1_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_l1_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_layer_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_leaky_relu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_linear_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_local_response_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_logsigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_unpool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_unpool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_unpool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_mish_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_mse_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_normalize_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_normalize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_circular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_circular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_constant_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_constant_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_reflect_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_reflect_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_replicate_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_replicate_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pairwise_distance_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pdist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_prelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_relu6_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_relu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_rms_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_rms_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_rrelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_selu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_silu_complex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_silu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_softmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_softplus_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_softshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_softsign_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_softsign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_tanhshrink_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_tanhshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_threshold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_unfold_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_unfold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nn_functional_upsample_nearest_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nonzero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nonzero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nonzero_static_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_nonzero_static_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_fro_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_fro_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_inf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_inf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_nuc_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_norm_nuc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_normal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_normal_in_place_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_normal_in_place_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_normal_number_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ones_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ones_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ones_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ormqr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ormqr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_outer_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_outer_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_pca_lowrank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_pca_lowrank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_permute_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_permute_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_permute_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_permute_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_pinverse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_pinverse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_polar_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_polygamma_polygamma_n_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_polygamma_polygamma_n_1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_polygamma_polygamma_n_2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_polygamma_polygamma_n_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_polygamma_polygamma_n_4_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_positive_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_positive_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_pow_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_pow_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_put_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_put_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_qr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_qr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_quantile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rad2deg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rand_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rand_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_randint_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_randint_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_randn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_randn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_randn_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_randn_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ravel_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_ravel_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_real_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_real_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_reciprocal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_reciprocal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_remainder_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_renorm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_renorm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_repeat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_repeat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_repeat_interleave_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_repeat_interleave_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_reshape_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_reshape_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_reshape_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_reshape_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resize__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resize__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resize_as__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resize_as__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resolve_conj_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resolve_conj_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resolve_neg_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_resolve_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_roll_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_roll_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rot90_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rot90_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_round_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_round_decimals_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_round_decimals_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_round_decimals_neg_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rsqrt_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rsqrt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rsub_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_rsub_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scalar_tensor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scalar_tensor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_reduce_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_reduce_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_reduce_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_scatter_reduce_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_searchsorted_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_select_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sgn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sgn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_short_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_short_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sigmoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_bartlett_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_blackman_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_cosine_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_exponential_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_gaussian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_general_cosine_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_general_hamming_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_hamming_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_hann_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_kaiser_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signal_windows_nuttall_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_signbit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sin_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sinc_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sinc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sinh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sinh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_slice_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_slice_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_slice_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_softmax_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_softmax_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sparse_mm_reduce_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sparse_sampled_addmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sparse_sampled_addmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_airy_ai_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_bessel_j0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_bessel_j1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_bessel_y0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_bessel_y1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_entr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_erfcx_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_hermite_polynomial_h_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_hermite_polynomial_he_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_i0e_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_i1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_i1e_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_laguerre_polynomial_l_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_legendre_polynomial_p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_log_ndtr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_modified_bessel_i0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_modified_bessel_i1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_modified_bessel_k0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_modified_bessel_k1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_ndtr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_ndtri_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_spherical_bessel_j0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_xlog1py_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_special_zeta_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_list_args_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_list_args_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_with_sizes_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_with_sizes_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_with_sizes_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_split_with_sizes_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sqrt_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sqrt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_square_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_square_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_squeeze_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_squeeze_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_squeeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_squeeze_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_squeeze_multiple_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_squeeze_multiple_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_stack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_stack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_mean_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_std_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_stft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_stft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sub_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sub_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sum_to_size_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_sum_to_size_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_svd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_svd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_svd_lowrank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_svd_lowrank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_t_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_t_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_t_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_take_along_dim_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_take_along_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_take_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_take_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tanh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tensor_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tensor_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tensordot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tensordot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tile_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_to_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_to_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_to_sparse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_to_sparse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_topk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_transpose_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_transpose_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_transpose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_transpose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trapezoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trapezoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trapz_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trapz_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_triangular_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_triangular_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tril_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_tril_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_triu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_triu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_true_divide_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_true_divide_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_trunc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unbind_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unbind_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unbind_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unbind_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unflatten_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unflatten_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unfold_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unfold_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unfold_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unfold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_uniform_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_uniform_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unique_consecutive_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unique_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsafe_chunk_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsafe_chunk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsafe_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsafe_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsqueeze_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsqueeze_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsqueeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_unsqueeze_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_mean_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_var_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_vdot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_vdot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_view_as_complex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_view_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_view_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_view_as_real_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_view_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_view_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_view_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_view_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_vsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_vsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_vstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_vstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_where_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_where_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_xlogy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zero__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zero__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zeros_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zeros_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zeros_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_zeros_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_H_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_H_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_T_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_T_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___getitem___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___getitem___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___radd___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___radd___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rdiv___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rdiv___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rmatmul___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rmatmul___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rmod___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rmul___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rmul___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rpow___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rpow___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rsub___cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD___rsub___cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__batch_norm_with_update_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__chunk_cat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__chunk_cat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__native_batch_norm_legit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__segment_reduce_lengths_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__segment_reduce_offsets_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__softmax_backward_data_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__unsafe_masked_index_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__unsafe_masked_index_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD__upsample_bilinear2d_aa_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_abs_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_abs_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_acos_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_acos_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_acosh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_acosh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addbmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addbmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addcdiv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addcdiv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addcmul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addcmul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addmm_decomposed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addmm_decomposed_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addmv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addmv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_addr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_alias_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_alias_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_all_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_all_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_allclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_allclose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_aminmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_angle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_angle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_any_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_any_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_arange_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_argmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_argsort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_argwhere_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_argwhere_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_partial_views_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_partial_views_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_as_strided_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_asin_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_asin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_asinh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_asinh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atan2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atanh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atleast_1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atleast_1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atleast_2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atleast_2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atleast_3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_atleast_3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_baddbmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_baddbmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bernoulli_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bfloat16_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bfloat16_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_block_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_block_diag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bool_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bool_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_broadcast_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_broadcast_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_broadcast_to_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_broadcast_to_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_bucketize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_byte_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_byte_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cartesian_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cartesian_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cauchy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cdist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cdouble_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cdouble_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ceil_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cfloat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cfloat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_chalf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_chalf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_char_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_char_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cholesky_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cholesky_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cholesky_inverse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cholesky_inverse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cholesky_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cholesky_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_chunk_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_chunk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_clamp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_clamp_max_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_clamp_min_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_clone_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_clone_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_column_stack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_column_stack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_combinations_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_combinations_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_complex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_conj_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_conj_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_conj_physical_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_conj_physical_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_constant_pad_nd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_constant_pad_nd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_contiguous_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_contiguous_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_copysign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_corrcoef_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_corrcoef_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cos_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cos_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cosh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cosh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_count_nonzero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_count_nonzero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cov_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cov_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cross_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cross_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cummax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cummin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cumprod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cumprod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cumsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cumsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cumulative_trapezoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_cumulative_trapezoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_deg2rad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diag_embed_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diag_embed_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagflat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagflat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagonal_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagonal_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagonal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagonal_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diagonal_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diff_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_diff_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_digamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dist_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_div_floor_rounding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_div_no_rounding_mode_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_div_no_rounding_mode_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_div_trunc_rounding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_double_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_double_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_dstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_einsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_einsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_permuted_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_permuted_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_empty_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_eq_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_eq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_equal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_equal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_erf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_erfc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_erfinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_exp2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_exp2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_exp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_exp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expand_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expand_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expand_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expand_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expand_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expand_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expm1_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_expm1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_exponential_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_eye_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_eye_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_fftshift_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_hfft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_hfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_hfft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_hfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_hfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_hfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifftshift_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ifftshift_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ihfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ihfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_ihfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_irfft2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_irfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_irfft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_irfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_irfftn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_irfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_rfft2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_rfft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fft_rfftn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_flatten_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_flatten_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_flip_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_flip_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fliplr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fliplr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_flipud_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_flipud_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_float_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_float_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_float_power_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_float_power_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_floor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_floor_divide_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_fmod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_frac_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_frexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_full_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_full_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_full_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_full_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_gather_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_gather_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ge_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_geometric_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_geqrf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_geqrf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_gradient_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_gradient_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_grid_sampler_2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_gt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_half_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_half_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_hash_tensor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_heaviside_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_histc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_hsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_hsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_hstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_hstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_hypot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_i0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_igamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_igammac_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_imag_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_put_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_put_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_reduce_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_reduce_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_reduce_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_index_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_inner_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_inner_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_int_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_int_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isclose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isclose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isfinite_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isfinite_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isinf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isinf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isnan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isnan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isneginf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isposinf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isreal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_isreal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_istft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_item_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_item_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_binary_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_unary_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_jiterator_unary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_kron_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_kron_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_kthvalue_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ldexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ldexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_le_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lerp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lerp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lgamma_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cholesky_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cholesky_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cholesky_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cholesky_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cond_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cond_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cross_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_cross_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_det_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_det_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_diagonal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_diagonal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_eig_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_eig_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_eigh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_eigh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_eigvals_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_eigvals_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_eigvalsh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_eigvalsh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_householder_product_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_householder_product_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_inv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_inv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_inv_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_inv_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_ldl_factor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_ldl_factor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_ldl_factor_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_ldl_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_ldl_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lstsq_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lstsq_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_factor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_factor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_factor_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_factor_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_lu_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_power_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_power_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_rank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_rank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_multi_dot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_multi_dot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_pinv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_pinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_pinv_hermitian_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_pinv_hermitian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_pinv_singular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_pinv_singular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_qr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_qr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_slogdet_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_slogdet_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_solve_ex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_solve_ex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_solve_triangular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_solve_triangular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_svd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_svd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_svdvals_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_svdvals_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_tensorinv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_tensorinv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_tensorsolve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_tensorsolve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_vander_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_vander_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_vecdot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_vecdot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_vector_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linalg_vector_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linspace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linspace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linspace_tensor_overload_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_linspace_tensor_overload_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log10_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log10_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log1p_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log1p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log2_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log_normal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log_softmax_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_log_softmax_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logaddexp2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logaddexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logcumsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logcumsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logdet_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logdet_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_and_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_and_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_not_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_not_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_or_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_or_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_xor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logical_xor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logspace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logspace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logspace_tensor_overload_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logspace_tensor_overload_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_logsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_long_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_long_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lu_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lu_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lu_unpack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_lu_unpack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mH_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mH_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mT_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mT_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_argmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_argmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_cumprod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_cumprod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_cumsum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_cumsum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_fill_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_fill_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_log_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_logaddexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_logsumexp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_logsumexp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_median_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_normalize_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_normalize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_softmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_std_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_std_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_sum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_var_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_masked_var_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_matmul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_matmul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_matrix_exp_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_matrix_exp_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_max_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_max_reduction_no_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_max_reduction_with_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_maximum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_median_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_meshgrid_list_of_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_meshgrid_variadic_tensors_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_min_binary_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_min_reduction_no_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_min_reduction_with_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_minimum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mode_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_movedim_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_movedim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_msort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mul_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mul_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_multinomial_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mv_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mv_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nan_to_num_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nanmean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nanmean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nanmedian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nanquantile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nansum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nansum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_narrow_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_narrow_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_narrow_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_narrow_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_native_batch_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_native_dropout_backward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_native_layer_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ne_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ne_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_neg_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_empty_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_empty_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_empty_strided_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_empty_strided_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_full_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_full_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_ones_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_zeros_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_new_zeros_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nextafter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_alpha_dropout_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_avg_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_avg_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_avg_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_batch_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_celu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_channel_shuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_cosine_similarity_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_cross_entropy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_ctc_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_dropout2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_dropout3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_dropout_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_elu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_embedding_bag_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_embedding_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_gelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_glu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_grid_sample_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_group_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_hardshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_hardsigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_hardswish_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_hardtanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_huber_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_instance_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_interpolate_area_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_interpolate_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_kl_div_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_l1_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_l1_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_layer_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_leaky_relu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_linear_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_linear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_local_response_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_logsigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_pool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_pool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_pool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_unpool1d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_unpool2d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_unpool3d_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_mish_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_mse_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_normalize_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_normalize_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_circular_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_circular_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_constant_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_constant_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_reflect_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_reflect_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_replicate_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_replicate_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pairwise_distance_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pdist_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_prelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_relu6_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_relu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_rms_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_rms_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_rrelu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_selu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_silu_complex_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_silu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_softmin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_softplus_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_softshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_softsign_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_softsign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_tanhshrink_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_tanhshrink_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_threshold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_unfold_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_unfold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nn_functional_upsample_nearest_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nonzero_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nonzero_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nonzero_static_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_nonzero_static_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_norm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_norm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_norm_fro_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_norm_fro_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_norm_inf_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_norm_inf_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_norm_nuc_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_norm_nuc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_normal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_normal_in_place_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_normal_in_place_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_normal_number_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ones_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ones_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ones_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ones_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ormqr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ormqr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_outer_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_outer_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_pca_lowrank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_pca_lowrank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_permute_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_permute_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_permute_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_permute_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_pinverse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_pinverse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_polar_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_polygamma_polygamma_n_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_polygamma_polygamma_n_1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_polygamma_polygamma_n_2_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_polygamma_polygamma_n_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_polygamma_polygamma_n_4_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_positive_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_positive_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_pow_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_pow_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_prod_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_put_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_put_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_qr_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_qr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_quantile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rad2deg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rand_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rand_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_randint_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_randint_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_randn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_randn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_randn_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_randn_like_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ravel_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_ravel_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_real_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_real_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_reciprocal_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_reciprocal_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_remainder_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_renorm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_renorm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_repeat_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_repeat_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_repeat_interleave_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_repeat_interleave_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_reshape_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_reshape_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_reshape_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_reshape_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resize__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resize__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resize_as__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resize_as__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resolve_conj_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resolve_conj_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resolve_neg_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_resolve_neg_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_roll_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_roll_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rot90_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rot90_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_round_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_round_decimals_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_round_decimals_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_round_decimals_neg_3_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rsqrt_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rsqrt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rsub_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_rsub_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scalar_tensor_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scalar_tensor_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_add_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_add_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_reduce_amax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_reduce_amin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_reduce_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_reduce_prod_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_scatter_reduce_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_searchsorted_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_select_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_select_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_select_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sgn_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sgn_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_short_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_short_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sigmoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sigmoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sign_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_bartlett_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_blackman_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_cosine_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_exponential_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_gaussian_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_general_cosine_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_general_hamming_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_hamming_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_hann_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_kaiser_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signal_windows_nuttall_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_signbit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sin_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sin_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sinc_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sinc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sinh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sinh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_slice_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_slice_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_slice_scatter_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_softmax_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_softmax_with_dtype_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_softmax_with_dtype_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sort_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sparse_mm_reduce_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sparse_sampled_addmm_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sparse_sampled_addmm_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_airy_ai_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_bessel_j0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_bessel_j1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_bessel_y0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_bessel_y1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_entr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_erfcx_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_hermite_polynomial_h_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_hermite_polynomial_he_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_i0e_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_i1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_i1e_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_laguerre_polynomial_l_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_legendre_polynomial_p_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_log_ndtr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_modified_bessel_i0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_modified_bessel_i1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_modified_bessel_k0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_modified_bessel_k1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_ndtr_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_ndtri_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_spherical_bessel_j0_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_xlog1py_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_special_zeta_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_list_args_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_list_args_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_with_sizes_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_with_sizes_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_with_sizes_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_split_with_sizes_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sqrt_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sqrt_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_square_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_square_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_squeeze_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_squeeze_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_squeeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_squeeze_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_squeeze_multiple_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_squeeze_multiple_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_stack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_stack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_mean_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_std_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_stft_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_stft_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sub_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sub_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sum_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sum_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sum_to_size_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_sum_to_size_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_svd_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_svd_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_svd_lowrank_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_svd_lowrank_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_t_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_t_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_t_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_t_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_take_along_dim_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_take_along_dim_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_take_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_take_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tan_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tan_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tanh_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tanh_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tensor_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tensor_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tensordot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tensordot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tile_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tile_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_to_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_to_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_to_sparse_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_to_sparse_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_topk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_trace_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_trace_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_transpose_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_transpose_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_transpose_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_transpose_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_trapezoid_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_trapezoid_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_trapz_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_trapz_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_triangular_solve_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_triangular_solve_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tril_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_tril_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_triu_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_triu_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_true_divide_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_true_divide_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_trunc_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unbind_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unbind_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unbind_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unbind_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unflatten_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unflatten_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unfold_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unfold_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unfold_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unfold_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_uniform_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_uniform_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unique_consecutive_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unique_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsafe_chunk_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsafe_chunk_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsafe_split_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsafe_split_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsqueeze_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsqueeze_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsqueeze_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_unsqueeze_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_mean_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_mean_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_mean_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_mean_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_unbiased_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_var_unbiased_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_vdot_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_vdot_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_view_as_complex_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_view_as_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_view_as_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_view_as_real_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_view_copy_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_view_copy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_view_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_view_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_vsplit_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_vsplit_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_vstack_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_vstack_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_where_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_where_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_xlogy_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zero__cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zero__cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zeros_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zeros_cuda_float64, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zeros_like_cuda_complex128, test/test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_inplace_forward_mode_AD_zeros_like_cuda_float64 2025-08-14T22:46:45.1515636Z 2025-08-14T22:46:46.3805974Z 2025-08-14T22:46:46.3807141Z test_cuda_sanitizer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cuda_sanitizer_1.1_18743aaa2f74fa9a_.log 2025-08-14T22:46:46.3818504Z Running 31 items in this shard: test/test_cuda_sanitizer.py::TestArgumentHandler::test_add, test/test_cuda_sanitizer.py::TestArgumentHandler::test_cat, test/test_cuda_sanitizer.py::TestArgumentHandler::test_inplace, test/test_cuda_sanitizer.py::TestArgumentHandler::test_nonzero, test/test_cuda_sanitizer.py::TestArgumentHandler::test_out, test/test_cuda_sanitizer.py::TestArgumentHandler::test_split, test/test_cuda_sanitizer.py::TestArgumentHandler::test_tensor_names, test/test_cuda_sanitizer.py::TestEventHandler::test_all_reads_checked_failing, test/test_cuda_sanitizer.py::TestEventHandler::test_all_reads_checked_passing, test/test_cuda_sanitizer.py::TestEventHandler::test_branch_sync, test/test_cuda_sanitizer.py::TestEventHandler::test_chain_sync, test/test_cuda_sanitizer.py::TestEventHandler::test_correct_state_merging, test/test_cuda_sanitizer.py::TestEventHandler::test_deleted_record, test/test_cuda_sanitizer.py::TestEventHandler::test_device_synchronization_expired, test/test_cuda_sanitizer.py::TestEventHandler::test_device_synchronize, test/test_cuda_sanitizer.py::TestEventHandler::test_empty_kernel_launch, test/test_cuda_sanitizer.py::TestEventHandler::test_event_synchronize, test/test_cuda_sanitizer.py::TestEventHandler::test_expired_record, test/test_cuda_sanitizer.py::TestEventHandler::test_multiple_errors, test/test_cuda_sanitizer.py::TestEventHandler::test_multiple_wait, test/test_cuda_sanitizer.py::TestEventHandler::test_new_stream_is_synchronized, test/test_cuda_sanitizer.py::TestEventHandler::test_reads_check_last_write, test/test_cuda_sanitizer.py::TestEventHandler::test_record_override, test/test_cuda_sanitizer.py::TestEventHandler::test_simple_error, test/test_cuda_sanitizer.py::TestEventHandler::test_simple_passing, test/test_cuda_sanitizer.py::TestEventHandler::test_simple_sync, test/test_cuda_sanitizer.py::TestEventHandler::test_stream_synchronize, test/test_cuda_sanitizer.py::TestMessages::test_ensure_does_not_exist, test/test_cuda_sanitizer.py::TestMessages::test_ensure_exists, test/test_cuda_sanitizer.py::TestMessages::test_error_message, test/test_cuda_sanitizer.py::TestMessages::test_subclass 2025-08-14T22:46:46.3829127Z 2025-08-14T22:46:49.0533471Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:49.0534803Z import pkg_resources 2025-08-14T22:46:49.4373445Z Running test_monitor 1/1 ... [2025-08-14 22:46:49.436827] 2025-08-14T22:46:49.4373977Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:46:49.4375828Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_monitor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:46:49.437224] 2025-08-14T22:46:50.5701069Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:50.5702412Z import pkg_resources 2025-08-14T22:46:50.9567967Z Running profiler/test_cpp_thread 1/1 ... [2025-08-14 22:46:50.956230] 2025-08-14T22:46:50.9568605Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:46:50.9570495Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_cpp_thread.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:46:50.956655] 2025-08-14T22:46:54.2614476Z 2025-08-14T22:46:54.2615418Z test_monitor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_monitor_1.1_afb69103250d24ce_.log 2025-08-14T22:46:54.2617934Z Running 6 items in this shard: test/test_monitor.py::TestMonitor::test_event_handler, test/test_monitor.py::TestMonitor::test_fixed_count_stat, test/test_monitor.py::TestMonitor::test_interval_stat, test/test_monitor.py::TestMonitor::test_log_event, test/test_monitor.py::TestMonitor::test_wait_counter, test/test_monitor.py::TestMonitorTensorboard::test_event_handler 2025-08-14T22:46:54.2619812Z 2025-08-14T22:46:55.5814527Z 2025-08-14T22:46:55.5815454Z test_ops_gradients 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_gradients_1.1_956ae068c06984d0_.log 2025-08-14T22:46:55.7739132Z Running 5384 items in this shard: test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyCatCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyCubeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyMulScalarCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpySortCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpySplitCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpySplitCopyWithIntCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyTakeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_NumpyViewCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___getitem___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmatmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rpow___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_abs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcdiv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addmv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_alias_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_angle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_argwhere_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_block_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_broadcast_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chalf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_char_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cholesky_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clamp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_column_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_column_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_combinations_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_contiguous_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_count_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_dstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_einsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_erfinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exp2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ifftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ihfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_ihfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_irfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hash_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_heaviside_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_histc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_inner_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isfinite_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_isreal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_istft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_kron_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_le_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lgamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cholesky_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_inv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_pinv_singular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_tensorsolve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log1p_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_log_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logcumsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_xor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nan_to_num_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanmedian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ne_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ne_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hardtanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_local_response_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_mish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_relu6_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softsign_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_threshold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ormqr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pca_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_permute_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pinverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_polygamma_polygamma_n_4_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_quantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rad2deg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randint_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ravel_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_remainder_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_interleave_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reshape_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resolve_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_resolve_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_roll_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_searchsorted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_short_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_gaussian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_kaiser_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_slice_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_slice_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_bessel_j1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_bessel_y0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_laguerre_polynomial_l_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_square_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_to_size_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_t_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_to_sparse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_to_sparse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_transpose_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_transpose_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trapz_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trapz_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_true_divide_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_uniform_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_var_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_fail_gradgrad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyCatCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyCubeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyMulScalarCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpySortCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpySplitCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpySplitCopyWithIntCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyTakeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_NumpyViewCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___getitem___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmatmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rpow___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_abs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addcdiv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addcmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addmv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_alias_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_angle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_argwhere_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_auto_functionalize_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_block_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_broadcast_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chalf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_char_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cholesky_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clamp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_column_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_column_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_combinations_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cond_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_contiguous_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_count_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_dstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_einsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_erfinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ifftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ihfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_ihfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_irfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hash_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_heaviside_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_histc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_inner_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_invoke_quant_packed_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_invoke_quant_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_invoke_subgraph_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isfinite_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_isreal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_istft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_binary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_kron_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_le_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lgamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_pinv_singular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_tensorinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_tensorsolve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log1p_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_log_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logcumsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_xor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_map_triple_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nan_to_num_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nanmedian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_narrow_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ne_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ne_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hardtanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_local_response_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_mish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_relu6_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softsign_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_threshold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ormqr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pca_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_permute_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pinverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_polygamma_polygamma_n_4_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_quantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rad2deg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randint_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ravel_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_remainder_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_repeat_interleave_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reshape_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_resolve_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_roll_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scan_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_searchsorted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_short_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_gaussian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_kaiser_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_slice_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_slice_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_bessel_j1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_bessel_y0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_laguerre_polynomial_l_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_square_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_to_size_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_t_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_to_sparse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_to_sparse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trapz_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trapz_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_true_divide_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_uniform_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_var_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_while_loop_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyCatCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyCubeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyMulScalarCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpySortCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpySplitCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpySplitCopyWithIntCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyTakeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_NumpyViewCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___getitem___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmatmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rpow___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_abs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcdiv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addmv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_alias_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_angle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_argwhere_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_auto_functionalize_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_block_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_broadcast_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_chalf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_char_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cholesky_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clamp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_column_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_column_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_combinations_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cond_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_contiguous_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_count_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_dstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_einsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_erfinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ifftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ihfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_ihfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_irfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hash_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_heaviside_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_histc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_inner_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_quant_packed_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_quant_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_invoke_subgraph_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isfinite_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_isreal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_istft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_binary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_kron_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_le_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lgamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cholesky_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_pinv_singular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_tensorinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_tensorsolve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log1p_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_log_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logcumsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_xor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_map_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_map_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_map_triple_nested_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nan_to_num_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nanmedian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_narrow_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ne_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ne_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hardtanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_local_response_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_mish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_relu6_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softsign_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_threshold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ormqr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pca_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_permute_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pinverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_polygamma_polygamma_n_4_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_quantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rad2deg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randint_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ravel_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_remainder_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_repeat_interleave_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reshape_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resolve_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_resolve_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_roll_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scan_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_searchsorted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_short_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_gaussian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_kaiser_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_slice_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_slice_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_bessel_j1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_bessel_y0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_laguerre_polynomial_l_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_square_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sum_to_size_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_t_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_take_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_to_sparse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_to_sparse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_transpose_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_transpose_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trapz_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trapz_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_true_divide_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unfold_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_uniform_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_var_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_while_loop_simple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_gradgrad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyCatCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyCubeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyMulCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyMulScalarCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyNMSCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyNonzeroCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpySortCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpySplitCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpySplitCopyWithIntCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyTakeCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_NumpyViewCopyCustomOp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___getitem___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmatmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rpow___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_abs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addcdiv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addcmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addmv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_alias_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_angle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_argwhere_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_block_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_broadcast_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_chalf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_char_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cholesky_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_clamp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_column_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_column_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_combinations_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_contiguous_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_count_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_dstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_einsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_erfinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exp2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ifftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ihfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_ihfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_irfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hash_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_heaviside_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_histc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_inner_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isfinite_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_isreal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_istft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_binary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_kron_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_le_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lgamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cholesky_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_inv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_pinv_singular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_tensorsolve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log1p_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_log_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logcumsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_xor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nan_to_num_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanmedian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_narrow_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ne_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ne_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hardtanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_local_response_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_mish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_relu6_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softsign_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_threshold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ormqr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pca_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_permute_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pinverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_polygamma_polygamma_n_4_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_quantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rad2deg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randint_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ravel_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_remainder_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_interleave_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reshape_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resolve_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_resolve_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_roll_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_searchsorted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_short_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_gaussian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_kaiser_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_slice_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_slice_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_j1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_y0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_laguerre_polynomial_l_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_square_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sum_to_size_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_sparse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_to_sparse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_transpose_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_transpose_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapz_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trapz_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_true_divide_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unfold_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_uniform_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_var_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_grad_zeros_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_H_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_H_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_T_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_T_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___getitem___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___getitem___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___radd___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___radd___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rdiv___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rdiv___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rmatmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rmatmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rmod___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rmul___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rmul___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rpow___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rpow___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rsub___cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad___rsub___cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__batch_norm_with_update_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__chunk_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__chunk_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__native_batch_norm_legit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__segment_reduce_lengths_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__segment_reduce_offsets_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__softmax_backward_data_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad__upsample_bilinear2d_aa_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_abs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_abs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_acosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addcdiv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addcdiv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addcmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addcmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmm_decomposed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmm_decomposed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addmv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_addr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_alias_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_alias_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_all_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_all_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_allclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_allclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_aminmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_angle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_angle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_any_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_any_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_arange_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argsort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argwhere_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_argwhere_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_partial_views_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_partial_views_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_as_strided_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_asinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atan2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_atleast_3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_baddbmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_baddbmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bernoulli_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bfloat16_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bfloat16_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_block_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_block_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bool_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bool_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_broadcast_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_bucketize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_byte_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_byte_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cartesian_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cartesian_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cauchy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdouble_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cdouble_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ceil_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cfloat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cfloat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_chalf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_chalf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_char_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_char_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_inverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_inverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cholesky_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clamp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clamp_max_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clamp_min_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clone_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_clone_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_column_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_column_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_combinations_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_combinations_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_conj_physical_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_conj_physical_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_constant_pad_nd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_constant_pad_nd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_contiguous_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_contiguous_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_copysign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_corrcoef_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_corrcoef_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cos_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cos_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cosh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cosh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_count_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_count_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cov_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cov_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cummax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cummin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumulative_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_cumulative_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_deg2rad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_embed_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diag_embed_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagflat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagflat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diagonal_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diff_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_diff_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_digamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dist_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_floor_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_no_rounding_mode_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_no_rounding_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_div_trunc_rounding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_double_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_double_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_dstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_einsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_einsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_permuted_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_permuted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_eq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_eq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_equal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_equal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_erf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_erfc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_erfinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_exp2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_exp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expand_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expm1_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_expm1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_eye_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_eye_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_fftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_hfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifftshift_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ifftshift_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ihfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ihfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_ihfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfftn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_irfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_rfft2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_rfft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fft_rfftn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flip_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flip_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fliplr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fliplr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flipud_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_flipud_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_float_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_float_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_float_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_float_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_floor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_floor_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_fmod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_frac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_frexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_full_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_full_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gather_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gather_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ge_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_geometric_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_geqrf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_geqrf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gradient_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gradient_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_grid_sampler_2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_gt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_half_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_half_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hash_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_heaviside_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_histc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_hypot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_igamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_igammac_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_imag_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_index_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_inner_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_inner_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_int_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_int_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isclose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isclose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isfinite_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isfinite_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isinf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isnan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isnan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isneginf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isposinf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isreal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_isreal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_istft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_item_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_item_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_2inputs_2outputs_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_4inputs_with_extra_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_binary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_binary_return_by_ref_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_binary_return_by_ref_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_unary_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_jiterator_unary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_kron_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_kron_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_kthvalue_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ldexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ldexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_le_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lerp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lerp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lgamma_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cholesky_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cond_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cond_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cross_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_cross_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_det_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_det_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_diagonal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_diagonal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eig_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eig_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigvalsh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_eigvalsh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_householder_product_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_householder_product_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_inv_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_ldl_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_grad_oriented_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lstsq_grad_oriented_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_factor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_factor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_factor_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_factor_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_power_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_power_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_multi_dot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_multi_dot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_norm_subgradients_at_zero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_hermitian_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_hermitian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_singular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_pinv_singular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_slogdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_slogdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_ex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_ex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_triangular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_solve_triangular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svdvals_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_svdvals_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorinv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorinv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorsolve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_tensorsolve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vander_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vander_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vecdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vecdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vector_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linalg_vector_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_linspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log10_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log10_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log1p_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log1p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log2_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_log_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logaddexp2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logcumsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logcumsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logdet_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logdet_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_and_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_and_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_not_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_not_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_or_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_or_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_xor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logical_xor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logspace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logspace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logspace_tensor_overload_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logspace_tensor_overload_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_long_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_long_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_unpack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_lu_unpack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mH_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mH_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mT_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mT_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_argmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_argmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_cumprod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_cumprod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_cumsum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_cumsum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_fill_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_fill_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_log_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_logaddexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_logsumexp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_logsumexp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_masked_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matmul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matmul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matrix_exp_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_matrix_exp_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_max_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_max_pool2d_with_indices_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_max_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_max_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_maximum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_median_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_list_of_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_list_of_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_variadic_tensors_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_meshgrid_variadic_tensors_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_min_binary_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_min_reduction_no_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_min_reduction_with_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_minimum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mode_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_movedim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_movedim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_msort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mul_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mul_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_multinomial_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mv_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mv_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nan_to_num_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanmedian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nanquantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nansum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nansum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_narrow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_dropout_backward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_native_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ne_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ne_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_empty_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_empty_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_empty_strided_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_empty_strided_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_full_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_full_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_new_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nextafter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_alpha_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_avg_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_avg_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_avg_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_batch_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_binary_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_celu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_channel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_channel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose1d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose2d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose3d_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_conv_transpose3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cosine_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cosine_similarity_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_cross_entropy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_ctc_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_dropout2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_dropout3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_dropout_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_elu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_embedding_bag_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_embedding_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_fractional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_gaussian_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_gelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_glu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_grid_sample_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_group_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_hardshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_hardsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_hardswish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_hardtanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_hinge_embedding_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_huber_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_instance_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_area_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_bicubic_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_kl_div_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_l1_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_layer_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_leaky_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_linear_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_linear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_local_response_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_logsigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_margin_ranking_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_pool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_pool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_pool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool1d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool1d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool2d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool2d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool3d_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_max_unpool3d_grad_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_mish_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_mse_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_multi_head_attention_forward_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_multi_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_multilabel_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_normalize_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_normalize_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_circular_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_circular_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_constant_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_constant_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_reflect_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_reflect_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_replicate_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_replicate_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_replicate_negative_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pad_replicate_negative_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pairwise_distance_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pairwise_distance_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pdist_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_shuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_pixel_unshuffle_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_prelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_relu6_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_relu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_rms_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_rms_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_rrelu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_selu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_silu_complex_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_silu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_soft_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softmin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softplus_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softsign_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_softsign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_tanhshrink_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_tanhshrink_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_threshold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_triplet_margin_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_triplet_margin_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_upsample_bilinear_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nn_functional_upsample_nearest_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nonzero_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nonzero_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nonzero_static_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_nonzero_static_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_fro_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_fro_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_inf_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_inf_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_nuc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_norm_nuc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_normal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_normal_in_place_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_normal_in_place_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_normal_number_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ones_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ones_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ones_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ones_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ormqr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ormqr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_outer_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_outer_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pca_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pca_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_permute_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_permute_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_permute_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_permute_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pinverse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pinverse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polar_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_2_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_polygamma_polygamma_n_4_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_positive_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_positive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pow_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_pow_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_prod_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_put_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_put_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_qr_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_qr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_quantile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rad2deg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rand_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rand_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randint_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randint_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_randn_like_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ravel_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_ravel_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_real_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reciprocal_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reciprocal_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_remainder_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_renorm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_renorm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_interleave_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_repeat_interleave_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_reshape_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resize__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resize__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resize_as__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resize_as__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resolve_conj_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resolve_conj_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resolve_neg_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_resolve_neg_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_roll_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_roll_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rot90_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rot90_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_round_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_round_decimals_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_round_decimals_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_round_decimals_neg_3_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_rsub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scalar_tensor_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scalar_tensor_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_add_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_add_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_amax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_amin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_prod_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_scatter_reduce_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_searchsorted_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_select_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_select_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_select_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sgn_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sgn_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_short_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_short_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sigmoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sigmoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sign_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_bartlett_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_blackman_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_exponential_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_gaussian_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_general_cosine_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_general_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_hamming_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_hann_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_kaiser_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signal_windows_nuttall_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_signbit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sin_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sin_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sinc_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sinc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sinh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sinh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_slice_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_slice_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_slice_scatter_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_softmax_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_softmax_with_dtype_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_softmax_with_dtype_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sort_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sparse_mm_reduce_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sparse_sampled_addmm_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sparse_sampled_addmm_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_airy_ai_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_bessel_j1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_bessel_y0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_bessel_y1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_entr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_erfcx_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_hermite_polynomial_h_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_hermite_polynomial_he_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_i0e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_i1e_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_laguerre_polynomial_l_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_legendre_polynomial_p_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_log_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_i0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_i1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_ndtr_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_ndtri_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_scaled_modified_bessel_k0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_scaled_modified_bessel_k1_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_spherical_bessel_j0_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_xlog1py_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_special_zeta_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_list_args_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_list_args_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_with_sizes_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_with_sizes_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_with_sizes_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_split_with_sizes_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sqrt_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sqrt_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_square_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_square_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_multiple_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_squeeze_multiple_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_std_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stft_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_stft_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sub_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sub_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_to_size_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_sum_to_size_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_svd_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_svd_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_svd_lowrank_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_svd_lowrank_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_t_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_take_along_dim_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_take_along_dim_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_take_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_take_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tan_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tan_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tanh_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tanh_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tensor_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tensor_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tensordot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tensordot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tile_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tile_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_to_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_to_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_to_sparse_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_to_sparse_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_topk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trace_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trace_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_transpose_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_transpose_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_transpose_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_transpose_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trapezoid_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trapezoid_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trapz_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trapz_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_triangular_solve_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_triangular_solve_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tril_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_tril_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_triu_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_triu_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_true_divide_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_true_divide_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_trunc_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unbind_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unbind_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unbind_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unbind_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unflatten_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unflatten_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unfold_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_uniform_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_uniform_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unique_consecutive_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unique_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsafe_chunk_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsafe_chunk_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsafe_split_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsafe_split_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsqueeze_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsqueeze_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsqueeze_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_unsqueeze_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_mean_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_unbiased_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_var_unbiased_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vdot_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vdot_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_as_complex_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_as_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_as_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_as_real_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_copy_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_copy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_view_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vsplit_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vsplit_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vstack_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_vstack_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_where_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_where_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_xlogy_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zero__cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zero__cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_cuda_float64, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_like_cuda_complex128, test/test_ops_gradients.py::TestBwdGradientsCUDA::test_inplace_gradgrad_zeros_like_cuda_float64 2025-08-14T22:46:55.9879353Z 2025-08-14T22:46:57.5344606Z 2025-08-14T22:46:57.5345992Z profiler/test_cpp_thread 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_cpp_thread_1.1_b9ccb5213e60c531_.log 2025-08-14T22:46:57.5350802Z Running 6 items in this shard: test/profiler/test_cpp_thread.py::CppThreadTestCUDA::test_profile_memory_cuda, test/profiler/test_cpp_thread.py::CppThreadTestCUDA::test_with_enable_profiler_in_child_thread_cuda, test/profiler/test_cpp_thread.py::CppThreadTestCUDA::test_without_enable_profiler_in_child_thread_cuda, test/profiler/test_cpp_thread.py::CppThreadTestXPU::test_profile_memory_xpu, test/profiler/test_cpp_thread.py::CppThreadTestXPU::test_with_enable_profiler_in_child_thread_xpu, test/profiler/test_cpp_thread.py::CppThreadTestXPU::test_without_enable_profiler_in_child_thread_xpu 2025-08-14T22:46:57.5354280Z 2025-08-14T22:46:58.6106131Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:58.6107463Z import pkg_resources 2025-08-14T22:46:59.0093014Z Running test_subclass 1/1 ... [2025-08-14 22:46:59.008775] 2025-08-14T22:46:59.0093555Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:46:59.0097286Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_subclass.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:46:59.009379] 2025-08-14T22:46:59.8037011Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:46:59.8038340Z import pkg_resources 2025-08-14T22:47:00.1845022Z Running profiler/test_profiler 1/1 ... [2025-08-14 22:47:00.183879] 2025-08-14T22:47:00.1845471Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:00.1846498Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_profiler.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:00.184270] 2025-08-14T22:47:01.5775187Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:01.5776571Z import pkg_resources 2025-08-14T22:47:01.9520201Z Running functorch/test_vmap_registrations 1/1 ... [2025-08-14 22:47:01.951488] 2025-08-14T22:47:01.9520829Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:01.9524734Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_vmap_registrations.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:01.951878] 2025-08-14T22:47:03.8836657Z 2025-08-14T22:47:03.8837627Z test_subclass 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_subclass_1.1_09bfb567f3230108_.log 2025-08-14T22:47:03.8870088Z Running 100 items in this shard: test/test_subclass.py::TestSubclass::test_deepcopy_base_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_deepcopy_base_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_deepcopy_diag_tensor_below_as_param_False, test/test_subclass.py::TestSubclass::test_deepcopy_diag_tensor_below_as_param_True, test/test_subclass.py::TestSubclass::test_deepcopy_logging_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_deepcopy_logging_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_deepcopy_non_wrapper_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_deepcopy_non_wrapper_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_deepcopy_sparse_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_deepcopy_sparse_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_deepcopy_wrapper_with_custom_sizes_as_param_False, test/test_subclass.py::TestSubclass::test_deepcopy_wrapper_with_custom_sizes_as_param_True, test/test_subclass.py::TestSubclass::test_deepcopy_wrapper_with_custom_strides_as_param_False, test/test_subclass.py::TestSubclass::test_deepcopy_wrapper_with_custom_strides_as_param_True, test/test_subclass.py::TestSubclass::test_lazy_module_base_tensor, test/test_subclass.py::TestSubclass::test_lazy_module_diag_tensor_below, test/test_subclass.py::TestSubclass::test_lazy_module_logging_tensor, test/test_subclass.py::TestSubclass::test_lazy_module_non_wrapper_tensor, test/test_subclass.py::TestSubclass::test_lazy_module_sparse_tensor, test/test_subclass.py::TestSubclass::test_lazy_module_wrapper_with_custom_sizes, test/test_subclass.py::TestSubclass::test_lazy_module_wrapper_with_custom_strides, test/test_subclass.py::TestSubclass::test_module_optimization_base_tensor, test/test_subclass.py::TestSubclass::test_module_optimization_diag_tensor_below, test/test_subclass.py::TestSubclass::test_module_optimization_logging_tensor, test/test_subclass.py::TestSubclass::test_module_optimization_non_wrapper_tensor, test/test_subclass.py::TestSubclass::test_module_optimization_sparse_tensor, test/test_subclass.py::TestSubclass::test_module_optimization_wrapper_with_custom_sizes, test/test_subclass.py::TestSubclass::test_module_optimization_wrapper_with_custom_strides, test/test_subclass.py::TestSubclass::test_non_rewrapping_torch_dispatch_subclass_as_parameter_throws_for_detach, test/test_subclass.py::TestSubclass::test_param_invariants_base_tensor_tensor_requires_grad_False, test/test_subclass.py::TestSubclass::test_param_invariants_base_tensor_tensor_requires_grad_True, test/test_subclass.py::TestSubclass::test_param_invariants_diag_tensor_below_tensor_requires_grad_False, test/test_subclass.py::TestSubclass::test_param_invariants_diag_tensor_below_tensor_requires_grad_True, test/test_subclass.py::TestSubclass::test_param_invariants_logging_tensor_tensor_requires_grad_False, test/test_subclass.py::TestSubclass::test_param_invariants_logging_tensor_tensor_requires_grad_True, test/test_subclass.py::TestSubclass::test_param_invariants_non_wrapper_tensor_tensor_requires_grad_False, test/test_subclass.py::TestSubclass::test_param_invariants_non_wrapper_tensor_tensor_requires_grad_True, test/test_subclass.py::TestSubclass::test_param_invariants_sparse_tensor_tensor_requires_grad_False, test/test_subclass.py::TestSubclass::test_param_invariants_sparse_tensor_tensor_requires_grad_True, test/test_subclass.py::TestSubclass::test_param_invariants_wrapper_with_custom_sizes_tensor_requires_grad_False, test/test_subclass.py::TestSubclass::test_param_invariants_wrapper_with_custom_sizes_tensor_requires_grad_True, test/test_subclass.py::TestSubclass::test_param_invariants_wrapper_with_custom_strides_tensor_requires_grad_False, test/test_subclass.py::TestSubclass::test_param_invariants_wrapper_with_custom_strides_tensor_requires_grad_True, test/test_subclass.py::TestSubclass::test_parametrization_base_tensor_leave_parametrized_False, test/test_subclass.py::TestSubclass::test_parametrization_base_tensor_leave_parametrized_True, test/test_subclass.py::TestSubclass::test_parametrization_diag_tensor_below_leave_parametrized_False, test/test_subclass.py::TestSubclass::test_parametrization_diag_tensor_below_leave_parametrized_True, test/test_subclass.py::TestSubclass::test_parametrization_logging_tensor_leave_parametrized_False, test/test_subclass.py::TestSubclass::test_parametrization_logging_tensor_leave_parametrized_True, test/test_subclass.py::TestSubclass::test_parametrization_non_wrapper_tensor_leave_parametrized_False, test/test_subclass.py::TestSubclass::test_parametrization_non_wrapper_tensor_leave_parametrized_True, test/test_subclass.py::TestSubclass::test_parametrization_sparse_tensor_leave_parametrized_False, test/test_subclass.py::TestSubclass::test_parametrization_sparse_tensor_leave_parametrized_True, test/test_subclass.py::TestSubclass::test_parametrization_wrapper_with_custom_sizes_leave_parametrized_False, test/test_subclass.py::TestSubclass::test_parametrization_wrapper_with_custom_sizes_leave_parametrized_True, test/test_subclass.py::TestSubclass::test_parametrization_wrapper_with_custom_strides_leave_parametrized_False, test/test_subclass.py::TestSubclass::test_parametrization_wrapper_with_custom_strides_leave_parametrized_True, test/test_subclass.py::TestSubclass::test_repr_base_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_repr_base_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_repr_diag_tensor_below_as_param_False, test/test_subclass.py::TestSubclass::test_repr_diag_tensor_below_as_param_True, test/test_subclass.py::TestSubclass::test_repr_logging_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_repr_logging_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_repr_non_wrapper_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_repr_non_wrapper_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_repr_sparse_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_repr_sparse_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_repr_wrapper_with_custom_sizes_as_param_False, test/test_subclass.py::TestSubclass::test_repr_wrapper_with_custom_sizes_as_param_True, test/test_subclass.py::TestSubclass::test_repr_wrapper_with_custom_strides_as_param_False, test/test_subclass.py::TestSubclass::test_repr_wrapper_with_custom_strides_as_param_True, test/test_subclass.py::TestSubclass::test_serialization_base_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_serialization_base_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_serialization_diag_tensor_below_as_param_False, test/test_subclass.py::TestSubclass::test_serialization_diag_tensor_below_as_param_True, test/test_subclass.py::TestSubclass::test_serialization_logging_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_serialization_logging_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_serialization_non_wrapper_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_serialization_non_wrapper_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_serialization_sparse_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_serialization_sparse_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_serialization_wrapper_with_custom_sizes_as_param_False, test/test_subclass.py::TestSubclass::test_serialization_wrapper_with_custom_sizes_as_param_True, test/test_subclass.py::TestSubclass::test_serialization_wrapper_with_custom_strides_as_param_False, test/test_subclass.py::TestSubclass::test_serialization_wrapper_with_custom_strides_as_param_True, test/test_subclass.py::TestSubclass::test_tensor_subclass_storage_data_accesses_throw, test/test_subclass.py::TestSubclass::test_type_propagation_base_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_type_propagation_base_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_type_propagation_diag_tensor_below_as_param_False, test/test_subclass.py::TestSubclass::test_type_propagation_diag_tensor_below_as_param_True, test/test_subclass.py::TestSubclass::test_type_propagation_logging_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_type_propagation_logging_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_type_propagation_non_wrapper_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_type_propagation_non_wrapper_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_type_propagation_sparse_tensor_as_param_False, test/test_subclass.py::TestSubclass::test_type_propagation_sparse_tensor_as_param_True, test/test_subclass.py::TestSubclass::test_type_propagation_wrapper_with_custom_sizes_as_param_False, test/test_subclass.py::TestSubclass::test_type_propagation_wrapper_with_custom_sizes_as_param_True, test/test_subclass.py::TestSubclass::test_type_propagation_wrapper_with_custom_strides_as_param_False, test/test_subclass.py::TestSubclass::test_type_propagation_wrapper_with_custom_strides_as_param_True 2025-08-14T22:47:03.8900750Z 2025-08-14T22:47:04.9086919Z 2025-08-14T22:47:04.9088250Z profiler/test_profiler 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_profiler_1.1_6efd2c1406cb25f8_.log 2025-08-14T22:47:04.9122605Z Running 70 items in this shard: test/profiler/test_profiler.py::TestProfilerCUDA::test_cudagraph_profiling_workaround, test/profiler/test_profiler.py::TestProfilerCUDA::test_custom_module_input_op_ids, test/profiler/test_profiler.py::TestProfilerCUDA::test_mem_leak, test/profiler/test_profiler.py::TestProfilerITT::test_custom_module_input_op_ids, test/profiler/test_profiler.py::TestProfiler::test_basic_chrome_trace, test/profiler/test_profiler.py::TestProfiler::test_basic_profile, test/profiler/test_profiler.py::TestProfiler::test_concrete_inputs_profiling, test/profiler/test_profiler.py::TestProfiler::test_concrete_inputs_profiling_toggling, test/profiler/test_profiler.py::TestProfiler::test_cpu_annotation_overlap, test/profiler/test_profiler.py::TestProfiler::test_disable_external_correlation, test/profiler/test_profiler.py::TestProfiler::test_dynamic_toggle, test/profiler/test_profiler.py::TestProfiler::test_event_list, test/profiler/test_profiler.py::TestProfiler::test_export_stacks, test/profiler/test_profiler.py::TestProfiler::test_flops, test/profiler/test_profiler.py::TestProfiler::test_forked_process, test/profiler/test_profiler.py::TestProfiler::test_guarded_record_function_fast, test/profiler/test_profiler.py::TestProfiler::test_high_level_trace, test/profiler/test_profiler.py::TestProfiler::test_is_profiler_enabled, test/profiler/test_profiler.py::TestProfiler::test_kineto, test/profiler/test_profiler.py::TestProfiler::test_kineto_multigpu, test/profiler/test_profiler.py::TestProfiler::test_kineto_profiler_api, test/profiler/test_profiler.py::TestProfiler::test_kineto_profiler_multiple_steppers, test/profiler/test_profiler.py::TestProfiler::test_kineto_profiler_with_environment_variable, test/profiler/test_profiler.py::TestProfiler::test_lazy_build_tree, test/profiler/test_profiler.py::TestProfiler::test_memory_profiler, test/profiler/test_profiler.py::TestProfiler::test_module_hierarchy, test/profiler/test_profiler.py::TestProfiler::test_nested_tensor_with_shapes, test/profiler/test_profiler.py::TestProfiler::test_oom_tracing, test/profiler/test_profiler.py::TestProfiler::test_override_time_units, test/profiler/test_profiler.py::TestProfiler::test_profile_all_threads, test/profiler/test_profiler.py::TestProfiler::test_profiler_correlation_id, test/profiler/test_profiler.py::TestProfiler::test_profiler_cuda_sync_events, test/profiler/test_profiler.py::TestProfiler::test_profiler_disable_fwd_bwd_link, test/profiler/test_profiler.py::TestProfiler::test_profiler_fwd_bwd_link, test/profiler/test_profiler.py::TestProfiler::test_profiler_metadata, test/profiler/test_profiler.py::TestProfiler::test_profiler_op_event_args, test/profiler/test_profiler.py::TestProfiler::test_profiler_op_event_kwargs, test/profiler/test_profiler.py::TestProfiler::test_profiler_strides, test/profiler/test_profiler.py::TestProfiler::test_profiler_time_scale, test/profiler/test_profiler.py::TestProfiler::test_profiler_tracing, test/profiler/test_profiler.py::TestProfiler::test_profiler_type, test/profiler/test_profiler.py::TestProfiler::test_record_function_fast, test/profiler/test_profiler.py::TestProfiler::test_schedule_function_count, test/profiler/test_profiler.py::TestProfiler::test_skip_first_wait, test/profiler/test_profiler.py::TestProfiler::test_source, test/profiler/test_profiler.py::TestProfiler::test_tensorboard_trace_handler, test/profiler/test_profiler.py::TestProfiler::test_user_annotation, test/profiler/test_profiler.py::TestExperimentalUtils::test_bfs, test/profiler/test_profiler.py::TestExperimentalUtils::test_dfs, test/profiler/test_profiler.py::TestExperimentalUtils::test_fuzz_symbolize, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_conv2d_bias_followed_by_batchnorm2d_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_debug_autotuner, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_extra_cuda_copy_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_extra_cuda_copy_pattern_benchmark, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_for_loop_indexing_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_fp32_matmul_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_grad_not_set_to_none_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_matmul_dim_fp16_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_name_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_optimizer_single_tensor_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_overload_names, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_pattern_match_helper, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_pattern_matcher_json_report, test/profiler/test_profiler.py::TestExperimentalUtils::test_profiler_synchronized_dataloader_pattern, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_compute_idle_time, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_compute_queue_depth, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_compute_queue_depth_when_no_cuda_events, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_compute_self_time, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_get_optimizable_events, test/profiler/test_profiler.py::TestExperimentalUtils::test_utils_intervals_overlap 2025-08-14T22:47:04.9154656Z 2025-08-14T22:47:08.1149732Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:08.1151058Z import pkg_resources 2025-08-14T22:47:08.5081592Z Running nn/test_parametrization 1/1 ... [2025-08-14 22:47:08.507663] 2025-08-14T22:47:08.5082026Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:08.5084749Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_parametrization.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:08.508093] 2025-08-14T22:47:08.9951222Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:08.9952563Z import pkg_resources 2025-08-14T22:47:09.3879988Z Running nn/test_pruning 1/1 ... [2025-08-14 22:47:09.387458] 2025-08-14T22:47:09.3880398Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:09.3882560Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_pruning.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:09.387836] 2025-08-14T22:47:09.8835441Z 2025-08-14T22:47:09.8836247Z functorch/test_vmap_registrations 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_vmap_registrations_1.1_7566e4ce4b5bbc84_.log 2025-08-14T22:47:10.0051581Z Running 1719 items in this shard: test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[_test::cat], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[_test::get_first], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[_test::leaky_relu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__and__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__and__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__iand__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__iand__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__ior__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__ior__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__ixor__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__ixor__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__or__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__or__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__xor__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::__xor__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_add_batch_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_autocast_to_full_precision], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_autocast_to_reduced_precision], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_batch_norm_impl_index], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_batch_norm_impl_index_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cast_Byte], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cast_Char], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cast_Double], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cast_Float], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cast_Half], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cast_Int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cast_Long], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cast_Short], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_choose_qparams_per_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_convolution.deprecated], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_convolution_double_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_convolution_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cufft_clear_plan_cache], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cufft_get_plan_cache_max_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cufft_get_plan_cache_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_cufft_set_plan_cache_max_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_debug_has_internal_overlap], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_dim_arange], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_embedding_bag_sparse_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_fused_rms_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_gather_sparse_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_grid_sampler_2d_cpu_fallback_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_has_compatible_shallow_copy_type], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_is_zerotensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_lu_with_info], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_nnpack_available], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_pack_padded_sequence_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_pad_circular], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_pad_enum], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_pad_packed_sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_propagate_xla_data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_remove_batch_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_reshape_from_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_rowwise_prune], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_saturate_weight_to_fp16], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_scaled_dot_product_attention_math], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_shape_as_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sobol_engine_draw], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sobol_engine_ff_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sobol_engine_initialize_state_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sobol_engine_scramble_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_bsc_tensor_unsafe], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_bsr_tensor_unsafe], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_compressed_tensor_unsafe], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_coo_tensor_unsafe], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_csc_tensor_unsafe], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_csr_tensor_unsafe], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_log_softmax.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_log_softmax.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_mm.reduce], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_mm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_softmax.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_softmax.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_sum.dim_dtype], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_sum.dtype], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_sparse_sum], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_test_ambiguous_defaults.a], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_test_ambiguous_defaults.b], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_test_autograd_multiple_dispatch.ntonly], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_test_check_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_test_serialization_subcmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_test_string_default], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_thnn_differentiable_gru_cell_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_thnn_differentiable_lstm_cell_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_thnn_fused_lstm_cell_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_to_cpu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_unpack_dual], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_upsample_bicubic2d_aa.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_upsample_bilinear2d_aa.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_upsample_nearest_exact1d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_upsample_nearest_exact2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_upsample_nearest_exact3d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_use_cudnn_rnn_flatten_weight], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_validate_sparse_bsc_tensor_args], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_validate_sparse_bsr_tensor_args], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_validate_sparse_compressed_tensor_args], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_validate_sparse_coo_tensor_args], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_validate_sparse_csc_tensor_args], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_validate_sparse_csr_tensor_args], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_version], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_weight_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_weight_norm_differentiable_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_wrapped_linear_prepack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::_wrapped_quantized_linear_prepacked], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::absolute.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::absolute], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::absolute_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::adaptive_avg_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::adaptive_avg_pool2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::adaptive_avg_pool3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::adaptive_max_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::adjoint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::affine_grid_generator_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::align_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::align_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::align_to.ellipsis_idx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::align_to], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::all.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::all.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::alpha_dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::alpha_dropout_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::any.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::any.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arccos.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arccos], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arccos_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arccosh.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arccosh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arccosh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arcsin.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arcsin], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arcsin_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arcsinh.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arcsinh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arcsinh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctan.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctan2.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctan2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctan2_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctan], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctan_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctanh.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctanh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::arctanh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::argsort.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::argsort.stable], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::argsort.stable_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::argsort], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::argwhere], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::atleast_1d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::atleast_1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::atleast_2d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::atleast_2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::atleast_3d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::atleast_3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::avg_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::batch_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::bilinear], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::broadcast_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::broadcast_to], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::can_cast], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cartesian_prod], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cat.names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cat.names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cdist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::chain_matmul.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::chain_matmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::chalf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::choose_qparams_optimized], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::chunk], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::clip.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::clip.Tensor_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::clip.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::clip], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::clip_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::clip_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::coalesce], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::column_stack.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::column_stack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::combinations], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::concat.names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::concat.names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::concat.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::concat], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::concatenate.names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::concatenate.names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::concatenate.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::concatenate], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conj], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conj_physical], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::contiguous], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv1d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv2d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv3d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv_tbc_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv_transpose1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv_transpose2d.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::conv_transpose3d.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::corrcoef], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cosine_embedding_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cosine_similarity], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cov], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cross.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cross], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cross_entropy_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::ctc_loss.IntList], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::ctc_loss.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cudnn_is_acceptable], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cummax.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cummax.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cummaxmin_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cummin.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cummin.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumprod.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumprod.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumprod_.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumprod_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumsum.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumsum.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumsum_.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumulative_trapezoid.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::cumulative_trapezoid.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::det], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::diag.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::diag], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::diagflat], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::diagonal.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::diff.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::diff], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide.Scalar_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide.Tensor_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide.out_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide_.Scalar_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::divide_.Tensor_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::dropout_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::dsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::dsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::dstack.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::dstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::einsum], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::embedding_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::embedding_bag.padding_idx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::embedding_bag], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::embedding_sparse_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::empty.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::expand_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fake_quantize_per_channel_affine], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fake_quantize_per_channel_affine_cachemask_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fake_quantize_per_tensor_affine.tensor_qparams], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fake_quantize_per_tensor_affine], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fake_quantize_per_tensor_affine_cachemask_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fbgemm_linear_fp16_weight], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fbgemm_linear_fp16_weight_fp32_activation], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fbgemm_linear_int8_weight], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fbgemm_linear_int8_weight_fp32_activation], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fbgemm_linear_quantize_weight], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fbgemm_pack_gemm_matrix_fp16], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fbgemm_pack_quantized_matrix.KN], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fbgemm_pack_quantized_matrix], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::feature_alpha_dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::feature_alpha_dropout_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::feature_dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::feature_dropout_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_fft.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_fft2.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_fft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_fft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_fftn.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_fftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_fftshift], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_hfft.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_hfft2.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_hfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_hfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_hfftn.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_hfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ifft.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ifft2.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ifft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ifft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ifftn.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ifftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ifftshift], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ihfft.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ihfft2.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ihfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ihfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ihfftn.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_ihfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_irfft.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_irfft2.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_irfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_irfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_irfftn.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_irfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_rfft.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_rfft2.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_rfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_rfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_rfftn.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fft_rfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fill_diagonal_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fix.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fix], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fix_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::flatten.DimnameList], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::flatten.named_out_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::flatten.using_ints], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::flatten.using_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::flatten_dense_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fliplr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::flipud], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::float_power.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::float_power.Scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::float_power.Tensor_Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::float_power.Tensor_Scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::float_power.Tensor_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::float_power.Tensor_Tensor_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::float_power_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::float_power_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::frobenius_norm.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::frobenius_norm.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::fused_moving_avg_obs_fake_quant], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gather.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gather.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gather_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::ger.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::ger], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::get_gradients], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gradient.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gradient.scalararray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gradient.scalarint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gradient.scalarrayarray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gradient.scalarrayint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gradient.tensorarray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gradient.tensorarrayint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater.Scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater.Tensor_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater_equal.Scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater_equal.Tensor_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater_equal_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::greater_equal_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::grid_sampler], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::group_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gru.data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gru.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::gru_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::hinge_embedding_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::histogramdd.TensorList_bins], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::histogramdd.int_bins], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::histogramdd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::hsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::hsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::hstack.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::hstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::imag], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_add.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_copy.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_copy_.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_fill.Dimname_Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_fill.Dimname_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_fill_.Dimname_Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_fill_.Dimname_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_select.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_select.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::index_select_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::infinitely_differentiable_gelu_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::inner.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::inner], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::instance_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::inverse.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::inverse], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_complex], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_conj], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_distributed], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_floating_point], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_inference], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_leaf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_neg], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_nonzero], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_signed], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::is_vulkan_available], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::isclose], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::isfinite], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::isreal], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::istft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::item], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::kl_div], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::kron.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::kron], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::kthvalue.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::kthvalue.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::l1_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::layer_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::ldexp.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::ldexp.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::ldexp_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less.Scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less.Tensor_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less_equal.Scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less_equal.Tensor_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less_equal_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::less_equal_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_cholesky.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_cholesky], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_cond.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_cond.p_str], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_cond.p_str_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_cond], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_det.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_det], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_diagonal], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_eigh.eigvals], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_eigh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_eigvals], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_eigvalsh.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_eigvalsh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_inv.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_inv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_ldl_factor.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_ldl_factor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_lu_factor.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_lu_factor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matmul.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_norm.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_norm.str_ord], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_norm.str_ord_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_power.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_power], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_rank.atol_rtol_float], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_rank.atol_rtol_float_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_rank.atol_rtol_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_rank.atol_rtol_tensor_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_rank.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_rank.out_tol_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_rank.tol_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_matrix_rank], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_multi_dot.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_multi_dot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_norm.ord_str], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_norm.ord_str_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_norm.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_pinv.atol_rtol_float], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_pinv.atol_rtol_float_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_pinv.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_pinv.out_rcond_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_pinv.rcond_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_pinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_slogdet.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_slogdet], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_solve.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_solve], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_solve_ex.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_solve_ex], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_svd.U], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_svd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_svdvals.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_svdvals], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_tensorinv.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_tensorinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_tensorsolve.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_tensorsolve], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_vander], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_vecdot.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linalg_vecdot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::linear], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::log_sigmoid.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::log_sigmoid], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::log_softmax.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::log_softmax.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::logcumsumexp.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::logcumsumexp.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::logdet], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::logsumexp.names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::logsumexp.names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::lstm.data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::lstm.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::lstm_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::lu_solve.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::lu_solve], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::mH], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::mT], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::margin_ranking_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::masked_select_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::matmul.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::matmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::matrix_H], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::matrix_exp], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::matrix_exp_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::matrix_power.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::matrix_power], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::max.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::max.names_dim_max], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::max.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::max.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::max_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::max_pool1d_with_indices], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::max_pool2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::max_pool3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::mean.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::mean.names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::median.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::median.names_dim_values], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::meshgrid.indexing], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::meshgrid], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::min.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::min.names_dim_min], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::min.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::min.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::mish_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::mode.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::mode.dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::moveaxis.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::moveaxis.intlist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::movedim.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::movedim.intlist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::msort.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::msort], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::multilabel_margin_loss.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::multilabel_margin_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::multiply.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::multiply.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::multiply.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::multiply_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::multiply_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nanmean.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nanmean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nanmedian.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nanmedian.names_dim_values], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nanquantile.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nanquantile.scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nanquantile.scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nanquantile], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::narrow.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::narrow], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::native_channel_shuffle], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::negative.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::negative], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::negative_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nested_to_padded_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nll_loss.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nll_loss2d.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nll_loss2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nll_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nll_loss_nd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nonzero_numpy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::norm.names_ScalarOpt_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::norm.names_ScalarOpt_dim_dtype], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::norm.names_dtype_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::norm.names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::norm_except_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::not_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::not_equal.Scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::not_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::not_equal.Tensor_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::not_equal_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::not_equal_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nuclear_norm.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nuclear_norm.dim_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nuclear_norm.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::nuclear_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::numpy_T], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::one_hot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::orgqr.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::orgqr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::outer.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::outer], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::output_nr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::pad], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::pad_sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::pairwise_distance], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::pdist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::pin_memory], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::pinverse], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::poisson_nll_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::positive], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::prelu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::prod.Dimname_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::prod.dim_Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::promote_types], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::qr.Q], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::qr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::quantile.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::quantile.scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::quantile.scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::quantile], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::quantized_gru_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::quantized_lstm_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::quantized_rnn_relu_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::quantized_rnn_tanh_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rand.generator_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::randn.generator_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::randn.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::ravel], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::real], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::refine_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::relu6], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::relu6_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rename], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rename_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::repeat_interleave.self_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::repeat_interleave.self_int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::requires_grad_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::reshape], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::reshape_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::resolve_conj], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::resolve_neg], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::result_type.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::result_type.Scalar_Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::result_type.Scalar_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::result_type.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::retain_grad], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::retains_grad], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rms_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rnn_relu.data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rnn_relu.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rnn_relu_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rnn_tanh.data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rnn_tanh.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rnn_tanh_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::row_stack.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::row_stack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rrelu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::rrelu_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::scaled_dot_product_attention], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::scatter.dimname_src], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::scatter.dimname_value], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::scatter_add.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::select.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::selu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::selu_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::set_.source_Tensor_storage_offset], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::set_data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::silu_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::size.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::size.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::slogdet.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::slogdet], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::slow_conv3d.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::slow_conv3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::smm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::softmax.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::softmax.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sort.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sort.dimname_stable], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sort.dimname_values], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sort.dimname_values_stable], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_bsc_tensor.ccol_row_value], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_bsc_tensor.ccol_row_value_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_bsr_tensor.crow_col_value], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_bsr_tensor.crow_col_value_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_coo_tensor.indices], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_coo_tensor.indices_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_csc_tensor.ccol_row_value], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_csc_tensor.ccol_row_value_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_csr_tensor.crow_col_value], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sparse_csr_tensor.crow_col_value_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_digamma.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_digamma], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_erf.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_erf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_erfc.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_erfc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_erfinv.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_erfinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_exp2.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_exp2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_expit.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_expit], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_expm1.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_expm1], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_gammainc.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_gammainc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_gammaincc.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_gammaincc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_gammaln.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_gammaln], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_i0.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_i0], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_log1p.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_log1p], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_log_softmax], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_logit.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_logit], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_logsumexp.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_logsumexp], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_multigammaln.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_multigammaln], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_ndtr.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_ndtr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_polygamma.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_polygamma], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_psi.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_psi], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_round.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_round], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_sinc.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_sinc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_softmax], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_xlogy.other_scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_xlogy.other_scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_xlogy.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_xlogy.self_scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_xlogy.self_scalar_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::special_xlogy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::split.sizes], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::square.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::square], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::square_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::squeeze.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::squeeze_.dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sspaddmm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std.correction_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std.correction_names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std.names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std_mean.correction_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std_mean.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std_mean.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::std_mean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::stft.center], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::stft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::stride.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::stride.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::subtract.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::subtract.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::subtract.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::subtract_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::subtract_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sum.DimnameList_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sum.dim_DimnameList], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sum_to_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::svd.U], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::svd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::swapaxes], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::swapaxes_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::swapdims], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::swapdims_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sym_numel], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sym_size.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sym_storage_offset], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::sym_stride.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::take_along_dim.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::take_along_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::tensor_split.indices], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::tensor_split.sections], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::tensor_split.tensor_indices_or_sections], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::tensordot.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::tensordot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::thnn_conv2d.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::thnn_conv2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::tile], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to.device], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to.dtype], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to.dtype_layout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_dense], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_dense_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_mkldnn_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_sparse.sparse_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_sparse], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_sparse_bsc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_sparse_bsr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_sparse_csc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::to_sparse_csr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::trace_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::transpose.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::trapezoid.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::trapezoid.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::trapz.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::trapz.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::triplet_margin_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::true_divide.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::true_divide.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::true_divide.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::true_divide_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::true_divide_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::type_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::unbind.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::unflatten.Dimname], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::unflatten.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::unflatten_dense_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::unsafe_chunk], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::upsample_bicubic2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::upsample_bilinear2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::upsample_linear1d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::upsample_nearest1d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::upsample_nearest2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::upsample_nearest3d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::upsample_trilinear3d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::value_selecting_reduction_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::vander], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var.correction_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var.correction_names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var.names_out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var_mean.correction_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var_mean.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var_mean.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::var_mean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::view_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::vsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::vsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::vstack.out], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::vstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::where.ScalarOther], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::where.ScalarSelf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::where.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[aten::where], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::all_gather_into_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::all_gather_into_tensor_coalesced], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::all_reduce], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::all_reduce_coalesced], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::all_to_all_single], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::broadcast], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::reduce_scatter_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::reduce_scatter_tensor_coalesced], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[c10d_functional::wait_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[inductor::_alloc_from_pool], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[mkldnn::_is_mkldnn_acl_supported], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[mkldnn::_is_mkldnn_bf16_supported], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[mkldnn::_is_mkldnn_fp16_supported], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[prepacked::unpack_prepacked_sizes_conv2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[prepacked::unpack_prepacked_sizes_linear], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[profiler::_record_function_enter], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[profiler::_record_function_enter_new], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[profiler::_record_function_exit._RecordFunction], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[profiler::_record_function_exit], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv1d_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv2d_dilation], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv2d_groups], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv2d_output_padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv2d_padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv2d_stride], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv2d_transpose], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv2d_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv2d_unpack_sizes], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv3d_dilation], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv3d_groups], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv3d_output_padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv3d_padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv3d_stride], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv3d_transpose], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv3d_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose1d_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose2d_dilation], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose2d_groups], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose2d_output_padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose2d_padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose2d_stride], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose2d_transpose], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose2d_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose3d_dilation], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose3d_groups], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose3d_output_padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose3d_padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose3d_stride], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose3d_transpose], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_transpose3d_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::conv_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::embedding_bag_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::linear_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::linear_unpack_fp16], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[quantized::make_quantized_cell_params_fp16], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_a_batching_rule_for_composite_implicit_autograd_[sparse::qlinear_unpack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__and__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__and__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__iand__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__iand__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__ior__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__ior__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__ixor__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__ixor__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__or__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__or__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__xor__.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::__xor__.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_batch_norm_impl_index], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_convolution_double_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_convolution_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_fused_rms_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_has_compatible_shallow_copy_type], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_lu_with_info], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_pad_circular], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_scaled_dot_product_attention_math], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_test_check_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_upsample_bicubic2d_aa.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::_upsample_bilinear2d_aa.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::absolute], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::absolute_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::adaptive_avg_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::adaptive_avg_pool2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::adaptive_avg_pool3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::adaptive_max_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::adjoint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::alias_copy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arccos], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arccos_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arccosh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arccosh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arcsin], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arcsin_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arcsinh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arcsinh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arctan2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arctan2_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arctan], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arctan_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arctanh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::arctanh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::argsort.stable], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::argsort], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::as_strided_copy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::atleast_1d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::atleast_1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::atleast_2d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::atleast_2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::atleast_3d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::atleast_3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::avg_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::batch_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::broadcast_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::broadcast_to], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cartesian_prod], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cdist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::chunk], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::clip.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::clip], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::combinations], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::concat], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::concatenate], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conj_physical], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::contiguous], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv1d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv2d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv3d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv_transpose1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv_transpose2d.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::conv_transpose3d.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::corrcoef], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cosine_embedding_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cosine_similarity], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cov], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cross], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cross_entropy_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cumprod_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cumulative_trapezoid.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::cumulative_trapezoid.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::det], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::diag], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::diagonal_copy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::diff], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::divide.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::divide.Scalar_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::divide.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::divide.Tensor_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::divide_.Scalar_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::divide_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::divide_.Tensor_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::dsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::dsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::dstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::einsum], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::embedding_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::expand_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_fft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_fft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_fftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_fftshift], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_hfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_hfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_hfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_ifft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_ifft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_ifftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_ifftshift], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_ihfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_irfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_irfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_irfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_rfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_rfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fft_rfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fix], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::flatten.using_ints], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::fliplr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::flipud], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::float_power.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::float_power.Tensor_Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::float_power.Tensor_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::frobenius_norm.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::gather_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::ger], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::gradient.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::gradient.scalararray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::gradient.scalarint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::gradient.scalarrayarray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::gradient.scalarrayint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::gradient.tensorarray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::gradient.tensorarrayint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::greater.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::greater.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::greater_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::greater_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::grid_sampler], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::group_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::hinge_embedding_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::hsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::hsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::hstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::imag], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::index_select_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::inner], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::instance_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::inverse], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::is_complex], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::is_same_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::isfinite], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::isreal], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::kron], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::l1_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::layer_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::ldexp.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::less.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::less.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::less_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::less_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_cholesky], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_cond], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_det], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_diagonal], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_eigh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_eigvals], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_eigvalsh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_inv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_ldl_factor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_lu_factor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_matmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_matrix_norm.str_ord], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_matrix_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_matrix_power], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_matrix_rank.atol_rtol_float], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_matrix_rank.atol_rtol_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_multi_dot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_norm.ord_str], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_pinv.atol_rtol_float], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_pinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_solve], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_solve_ex], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_svd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_svdvals], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_tensorinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_vander], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linalg_vecdot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::linear], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::log_sigmoid], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::log_softmax.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::logdet], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::mH], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::mT], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::matmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::matrix_H], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::matrix_exp], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::matrix_power], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::max.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::max_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::max_pool1d_with_indices], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::max_pool2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::max_pool3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::meshgrid.indexing], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::meshgrid], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::min.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::moveaxis.intlist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::movedim.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::movedim.intlist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::msort], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::multiply.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::multiply.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::multiply_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::multiply_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::nanmean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::narrow], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::negative], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::nll_loss2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::nll_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::nll_loss_nd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::not_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::not_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::nuclear_norm.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::nuclear_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::numpy_T], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::orgqr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::outer], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::pad], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::pairwise_distance], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::pinverse], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::poisson_nll_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::positive], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::prelu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::qr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::ravel], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::real], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::relu6], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::relu6_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::repeat_interleave.self_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::repeat_interleave.self_int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::reshape], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::reshape_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::resolve_conj], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::resolve_neg], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::result_type.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::result_type.Scalar_Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::result_type.Scalar_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::result_type.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::rms_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::row_stack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::rrelu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::rrelu_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::scaled_dot_product_attention], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::selu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::selu_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::size.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::slogdet], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::softmax.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_digamma], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_erf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_erfc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_erfinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_exp2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_expit], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_expm1], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_gammainc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_gammaincc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_gammaln], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_i0], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_log1p], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_log_softmax], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_logit], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_logsumexp], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_multigammaln], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_ndtr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_polygamma], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_psi], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_round], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_sinc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_softmax], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_xlogy.other_scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_xlogy.self_scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::special_xlogy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::split.sizes], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::square], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::std.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::std], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::std_mean.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::std_mean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::subtract.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::sum_to_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::svd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::swapaxes], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::swapaxes_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::swapdims], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::swapdims_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::take_along_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::tensor_split.indices], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::tensor_split.sections], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::tensordot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::tile], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::to.device], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::to.dtype], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::to.dtype_layout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::to.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::trapezoid.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::trapezoid.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::trapz.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::trapz.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::true_divide.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::true_divide.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::true_divide_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::true_divide_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::type_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::unflatten.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::unfold_copy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::unsafe_chunk], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::upsample_bicubic2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::upsample_bilinear2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::upsample_linear1d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::upsample_nearest1d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::upsample_nearest2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::upsample_nearest3d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::upsample_trilinear3d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::value_selecting_reduction_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::var.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::var], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::var_mean.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::var_mean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::view_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::vsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::vsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::vstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::where.ScalarOther], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::where.ScalarSelf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_register_functorch_batched_decomposition_[aten::where.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::absolute], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::absolute_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::adaptive_avg_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::adaptive_avg_pool2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::adaptive_avg_pool3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::adaptive_max_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::adjoint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::affine_grid_generator_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::align_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::align_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::align_to.ellipsis_idx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::align_to], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::alpha_dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::alpha_dropout_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arccos], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arccos_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arccosh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arccosh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arcsin], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arcsin_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arcsinh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arcsinh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arctan2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arctan2_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arctan], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arctan_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arctanh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::arctanh_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::argsort.stable], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::argsort], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::argwhere], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::atleast_1d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::atleast_1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::atleast_2d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::atleast_2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::atleast_3d.Sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::atleast_3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::avg_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::batch_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::bilinear], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::broadcast_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::broadcast_to], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::can_cast], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cartesian_prod], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cat.names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cdist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::chain_matmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::chalf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::choose_qparams_optimized], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::chunk], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::clip.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::clip], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::clip_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::clip_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::coalesce], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::column_stack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::combinations], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::concat.names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::concat], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::concatenate.names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::concatenate], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conj], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conj_physical], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::contiguous], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv1d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv2d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv3d.padding], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv_tbc_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv_transpose1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv_transpose2d.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::conv_transpose3d.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::corrcoef], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cosine_embedding_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cosine_similarity], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cov], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cross], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cross_entropy_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::ctc_loss.IntList], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::ctc_loss.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cudnn_is_acceptable], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cummaxmin_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cumprod_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cumulative_trapezoid.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::cumulative_trapezoid.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::det], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::diag], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::diagflat], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::diff], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide.Scalar_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide.Tensor_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide.out_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide_.Scalar_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::divide_.Tensor_mode], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::dropout_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::dsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::dsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::dstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::einsum], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::embedding_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::embedding_bag.padding_idx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::embedding_bag], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::expand_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::feature_alpha_dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::feature_alpha_dropout_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::feature_dropout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::feature_dropout_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_fft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_fft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_fftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_fftshift], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_hfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_hfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_hfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_ifft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_ifft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_ifftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_ifftshift], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_ihfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_ihfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_ihfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_irfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_irfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_irfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_rfft2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_rfft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fft_rfftn], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fill_diagonal_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fix], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fix_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::flatten.named_out_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::flatten.using_ints], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::flatten.using_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::flatten_dense_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fliplr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::flipud], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::float_power.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::float_power.Tensor_Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::float_power.Tensor_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::float_power_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::float_power_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::frobenius_norm.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::fused_moving_avg_obs_fake_quant], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gather_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::ger], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::get_gradients], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gradient.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gradient.scalararray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gradient.scalarint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gradient.scalarrayarray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gradient.scalarrayint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gradient.tensorarray], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gradient.tensorarrayint], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::greater.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::greater.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::greater_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::greater_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::greater_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::greater_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::greater_equal_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::greater_equal_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::grid_sampler], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::group_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gru.data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gru.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::gru_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::hinge_embedding_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::histogramdd.TensorList_bins], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::histogramdd.int_bins], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::histogramdd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::hsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::hsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::hstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::imag], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::index_select_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::infinitely_differentiable_gelu_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::inner], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::instance_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::inverse], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::isclose], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::isfinite], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::isreal], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::istft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::item], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::kl_div], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::kron], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::l1_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::layer_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::ldexp.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::ldexp_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::less.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::less.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::less_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::less_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::less_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::less_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::less_equal_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::less_equal_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_cholesky], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_cond.p_str], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_cond], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_det], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_diagonal], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_eigh.eigvals], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_eigh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_eigvals], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_eigvalsh], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_inv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_ldl_factor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_lu_factor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matrix_norm.str_ord], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matrix_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matrix_power], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matrix_rank.atol_rtol_float], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matrix_rank.atol_rtol_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matrix_rank.out_tol_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matrix_rank.tol_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_matrix_rank], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_multi_dot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_norm.ord_str], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_pinv.atol_rtol_float], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_pinv.out_rcond_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_pinv.rcond_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_pinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_slogdet], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_solve], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_solve_ex], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_svd.U], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_svd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_svdvals], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_tensorinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_tensorsolve], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_vander], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linalg_vecdot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::linear], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::log_sigmoid], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::log_softmax.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::logdet], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::logsumexp.names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::lstm.data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::lstm.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::lstm_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::lu_solve], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::mH], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::mT], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::margin_ranking_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::masked_select_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::matmul], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::matrix_H], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::matrix_exp], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::matrix_exp_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::matrix_power], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::max.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::max.names_dim_max], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::max.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::max_pool1d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::max_pool1d_with_indices], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::max_pool2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::max_pool3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::mean.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::median.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::median.names_dim_values], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::meshgrid.indexing], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::meshgrid], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::min.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::min.names_dim_min], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::min.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::mish_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::moveaxis.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::moveaxis.intlist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::movedim.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::movedim.intlist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::msort], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::multilabel_margin_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::multiply.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::multiply.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::multiply_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::multiply_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nanmean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nanmedian.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nanmedian.names_dim_values], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nanquantile.scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nanquantile], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::narrow.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::narrow], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::native_channel_shuffle], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::negative], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::negative_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nested_to_padded_tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nll_loss2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nll_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nll_loss_nd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nonzero_numpy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::norm.names_ScalarOpt_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::norm.names_ScalarOpt_dim_dtype], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::norm_except_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::not_equal.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::not_equal.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::not_equal_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::not_equal_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nuclear_norm.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::nuclear_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::numpy_T], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::one_hot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::orgqr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::outer], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::output_nr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::pad], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::pad_sequence], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::pairwise_distance], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::pdist], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::pin_memory], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::pinverse], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::poisson_nll_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::positive], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::prelu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::promote_types], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::qr.Q], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::qr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::quantile.scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::quantile], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::ravel], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::real], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::refine_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::relu6], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::relu6_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rename], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rename_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::repeat_interleave.self_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::repeat_interleave.self_int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::requires_grad_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::reshape], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::reshape_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::resolve_conj], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::resolve_neg], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::result_type.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::result_type.Scalar_Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::result_type.Scalar_Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::result_type.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::retain_grad], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::retains_grad], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rms_norm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rnn_relu.data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rnn_relu.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rnn_relu_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rnn_tanh.data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rnn_tanh.input], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rnn_tanh_cell], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::row_stack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rrelu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::rrelu_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::scaled_dot_product_attention], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::selu], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::selu_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::set_.source_Tensor_storage_offset], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::set_data], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::silu_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::size.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::slogdet], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::slow_conv3d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::smm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::softmax.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_digamma], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_erf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_erfc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_erfinv], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_exp2], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_expit], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_expm1], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_gammainc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_gammaincc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_gammaln], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_i0], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_log1p], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_log_softmax], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_logit], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_logsumexp], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_multigammaln], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_ndtr], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_polygamma], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_psi], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_round], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_sinc], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_softmax], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_xlogy.other_scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_xlogy.self_scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::special_xlogy], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::split.sizes], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::square], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::square_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::sspaddmm], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::std.correction_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::std.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::std.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::std], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::std_mean.correction_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::std_mean.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::std_mean.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::std_mean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::stft.center], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::stft], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::stride.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::subtract.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::subtract.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::subtract_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::subtract_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::sum_to_size], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::svd.U], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::svd], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::swapaxes], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::swapaxes_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::swapdims], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::swapdims_], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::sym_numel], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::sym_size.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::sym_storage_offset], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::sym_stride.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::take_along_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::tensor_split.indices], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::tensor_split.sections], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::tensor_split.tensor_indices_or_sections], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::tensordot], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::thnn_conv2d], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::tile], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::to.device], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::to.dtype], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::to.dtype_layout], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::to.other], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::to_dense], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::to_dense_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::to_mkldnn_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::trace_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::trapezoid.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::trapezoid.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::trapz.dx], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::trapz.x], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::triplet_margin_loss], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::true_divide.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::true_divide.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::true_divide_.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::true_divide_.Tensor], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::type_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::unflatten.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::unflatten_dense_tensors], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::unsafe_chunk], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::upsample_bicubic2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::upsample_bilinear2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::upsample_linear1d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::upsample_nearest1d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::upsample_nearest2d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::upsample_nearest3d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::upsample_trilinear3d.vec], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::value_selecting_reduction_backward], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::vander], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::var.correction_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::var.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::var.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::var], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::var_mean.correction_names], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::var_mean.dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::var_mean.names_dim], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::var_mean], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::view_as], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::vsplit.array], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::vsplit.int], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::vstack], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::where.ScalarOther], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::where.ScalarSelf], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::where.Scalar], test/functorch/test_vmap_registrations.py::TestFunctorchDispatcher::test_unimplemented_batched_registrations_[aten::where] 2025-08-14T22:47:10.0954846Z 2025-08-14T22:47:13.2831101Z 2025-08-14T22:47:13.2832102Z nn/test_parametrization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_parametrization_1.1_ef1846533e323439_.log 2025-08-14T22:47:13.2856322Z Running 58 items in this shard: test/nn/test_parametrization.py::TestNNParametrization::test_caching_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_caching_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_caching_parametrization_with_transfer_parametrizations_and_params_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_caching_parametrization_with_transfer_parametrizations_and_params_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_deepcopy_after_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_deepcopy_after_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_errors_parametrized_tensor_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_errors_parametrized_tensor_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_errors_unparametrized_tensor_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_errors_unparametrized_tensor_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_initialization_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_initialization_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_multiple_inputs_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_multiple_inputs_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_dim_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_dim_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_forward_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_forward_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_load_state_dict_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_load_state_dict_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_value_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_new_spectral_norm_value_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_orthogonal_errors_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_orthogonal_errors_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_orthogonal_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_orthogonal_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_parametrization_same_training_mode_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_parametrization_same_training_mode_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_register_and_remove_buffer_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_register_and_remove_buffer_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_register_and_remove_nested_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_register_and_remove_nested_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_register_and_remove_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_register_and_remove_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_register_parametrization_no_grad, test/nn/test_parametrization.py::TestNNParametrization::test_serialization_parametrization_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_serialization_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_transfer_parametrizations_and_params_many_to_one_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_transfer_parametrizations_and_params_many_to_one_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_transfer_parametrizations_and_params_right_inverse_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_transfer_parametrizations_and_params_right_inverse_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_transfer_parametrizations_and_params_single_param_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_transfer_parametrizations_and_params_single_param_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_transfer_parametrizations_and_params_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_transfer_parametrizations_and_params_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_type_before_parametrizations_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_type_before_parametrizations_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_weight_norm_deepcopy_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_weight_norm_deepcopy_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_weight_norm_pickle_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_weight_norm_pickle_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_weight_norm_state_dict_compat_swap_False, test/nn/test_parametrization.py::TestNNParametrization::test_weight_norm_state_dict_compat_swap_True, test/nn/test_parametrization.py::TestNNParametrization::test_wrapper_subclass_parametrization_swap_True, test/nn/test_parametrization.py::TestNNParametrizationDeviceCUDA::test_weight_norm_parametrization_swap_False_cuda, test/nn/test_parametrization.py::TestNNParametrizationDeviceCUDA::test_weight_norm_parametrization_swap_True_cuda 2025-08-14T22:47:13.2879004Z 2025-08-14T22:47:13.9866519Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:14.2136608Z import pkg_resources 2025-08-14T22:47:14.2136825Z 2025-08-14T22:47:14.2137275Z nn/test_pruning 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_pruning_1.1_805917eaffb3b67f_.log 2025-08-14T22:47:14.2146506Z Running 34 items in this shard: test/nn/test_pruning.py::TestPruningNN::test_compute_nparams_to_prune, test/nn/test_pruning.py::TestPruningNN::test_custom_from_mask_pruning, test/nn/test_pruning.py::TestPruningNN::test_global_pruning, test/nn/test_pruning.py::TestPruningNN::test_global_pruning_importance_scores, test/nn/test_pruning.py::TestPruningNN::test_identity_pruning, test/nn/test_pruning.py::TestPruningNN::test_l1_unstructured_pruning, test/nn/test_pruning.py::TestPruningNN::test_l1_unstructured_pruning_with_importance_scores, test/nn/test_pruning.py::TestPruningNN::test_ln_structured_pruning, test/nn/test_pruning.py::TestPruningNN::test_ln_structured_pruning_importance_scores, test/nn/test_pruning.py::TestPruningNN::test_multiple_pruning_calls, test/nn/test_pruning.py::TestPruningNN::test_prune, test/nn/test_pruning.py::TestPruningNN::test_prune_importance_scores, test/nn/test_pruning.py::TestPruningNN::test_prune_importance_scores_mimic_default, test/nn/test_pruning.py::TestPruningNN::test_pruning_container, test/nn/test_pruning.py::TestPruningNN::test_pruning_container_compute_mask, test/nn/test_pruning.py::TestPruningNN::test_pruning_id_consistency, test/nn/test_pruning.py::TestPruningNN::test_pruning_rollback, test/nn/test_pruning.py::TestPruningNN::test_pruning_serialization_model, test/nn/test_pruning.py::TestPruningNN::test_pruning_serialization_state_dict, test/nn/test_pruning.py::TestPruningNN::test_random_pruning, test/nn/test_pruning.py::TestPruningNN::test_random_pruning_0perc, test/nn/test_pruning.py::TestPruningNN::test_random_pruning_forward, test/nn/test_pruning.py::TestPruningNN::test_random_pruning_new_weight, test/nn/test_pruning.py::TestPruningNN::test_random_pruning_orig, test/nn/test_pruning.py::TestPruningNN::test_random_pruning_pickle, test/nn/test_pruning.py::TestPruningNN::test_random_pruning_sizes, test/nn/test_pruning.py::TestPruningNN::test_random_structured_pruning_amount, test/nn/test_pruning.py::TestPruningNN::test_remove_pruning, test/nn/test_pruning.py::TestPruningNN::test_remove_pruning_exception, test/nn/test_pruning.py::TestPruningNN::test_remove_pruning_forward, test/nn/test_pruning.py::TestPruningNN::test_rnn_pruning, test/nn/test_pruning.py::TestPruningNN::test_unstructured_pruning_same_magnitude, test/nn/test_pruning.py::TestPruningNN::test_validate_pruning_amount, test/nn/test_pruning.py::TestPruningNN::test_validate_pruning_amount_init 2025-08-14T22:47:14.2155623Z 2025-08-14T22:47:14.3793128Z Running functorch/test_dims 1/1 ... [2025-08-14 22:47:14.378775] 2025-08-14T22:47:14.3793577Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:14.3795981Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_dims.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:14.379208] 2025-08-14T22:47:17.4164015Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:17.4166437Z import pkg_resources 2025-08-14T22:47:17.8031817Z Running test_itt 1/1 ... [2025-08-14 22:47:17.802475] 2025-08-14T22:47:17.8032204Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:17.8033170Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_itt.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:17.802940] 2025-08-14T22:47:18.3611158Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:18.3613407Z import pkg_resources 2025-08-14T22:47:18.7438015Z Running torch_np/numpy_tests/core/test_multiarray 1/2 ... [2025-08-14 22:47:18.743253] 2025-08-14T22:47:18.7438520Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:18.7440350Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_multiarray.py', '-m', 'not serial', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:18.743680] 2025-08-14T22:47:19.4544227Z 2025-08-14T22:47:19.4545099Z functorch/test_dims 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_dims_1.1_e5c9ef7d186282ed_.log 2025-08-14T22:47:19.4560807Z Running 72 items in this shard: test/functorch/test_dims.py::TestMin::test_adapt, test/functorch/test_dims.py::TestMin::test_attn, test/functorch/test_dims.py::TestMin::test_attn_cuda, test/functorch/test_dims.py::TestMin::test_big_split, test/functorch/test_dims.py::TestMin::test_c, test/functorch/test_dims.py::TestMin::test_compare_dims, test/functorch/test_dims.py::TestMin::test_diag, test/functorch/test_dims.py::TestMin::test_dim_args, test/functorch/test_dims.py::TestMin::test_dims_with_size, test/functorch/test_dims.py::TestMin::test_dir, test/functorch/test_dims.py::TestMin::test_doc, test/functorch/test_dims.py::TestMin::test_embed, test/functorch/test_dims.py::TestMin::test_eq, test/functorch/test_dims.py::TestMin::test_expand, test/functorch/test_dims.py::TestMin::test_functorch, test/functorch/test_dims.py::TestMin::test_hello, test/functorch/test_dims.py::TestMin::test_index, test/functorch/test_dims.py::TestMin::test_index_placement, test/functorch/test_dims.py::TestMin::test_inplace, test/functorch/test_dims.py::TestMin::test_manual_stuff, test/functorch/test_dims.py::TestMin::test_mask, test/functorch/test_dims.py::TestMin::test_max, test/functorch/test_dims.py::TestMin::test_mm, test/functorch/test_dims.py::TestMin::test_mm_fuse, test/functorch/test_dims.py::TestMin::test_monkey, test/functorch/test_dims.py::TestMin::test_network, test/functorch/test_dims.py::TestMin::test_order, test/functorch/test_dims.py::TestMin::test_order_keyword, test/functorch/test_dims.py::TestMin::test_parse, test/functorch/test_dims.py::TestMin::test_permute_orig, test/functorch/test_dims.py::TestMin::test_seg, test/functorch/test_dims.py::TestMin::test_simple, test/functorch/test_dims.py::TestMin::test_softmax_split, test/functorch/test_dims.py::TestMin::test_stack, test/functorch/test_dims.py::TestMin::test_time_mm_fuse, test/functorch/test_dims.py::TestMin::test_with_dims_split, test/functorch/test_dims.py::TestMinFunctorchOnly::test_adapt, test/functorch/test_dims.py::TestMinFunctorchOnly::test_attn, test/functorch/test_dims.py::TestMinFunctorchOnly::test_attn_cuda, test/functorch/test_dims.py::TestMinFunctorchOnly::test_big_split, test/functorch/test_dims.py::TestMinFunctorchOnly::test_c, test/functorch/test_dims.py::TestMinFunctorchOnly::test_compare_dims, test/functorch/test_dims.py::TestMinFunctorchOnly::test_diag, test/functorch/test_dims.py::TestMinFunctorchOnly::test_dim_args, test/functorch/test_dims.py::TestMinFunctorchOnly::test_dims_with_size, test/functorch/test_dims.py::TestMinFunctorchOnly::test_dir, test/functorch/test_dims.py::TestMinFunctorchOnly::test_doc, test/functorch/test_dims.py::TestMinFunctorchOnly::test_embed, test/functorch/test_dims.py::TestMinFunctorchOnly::test_eq, test/functorch/test_dims.py::TestMinFunctorchOnly::test_expand, test/functorch/test_dims.py::TestMinFunctorchOnly::test_functorch, test/functorch/test_dims.py::TestMinFunctorchOnly::test_hello, test/functorch/test_dims.py::TestMinFunctorchOnly::test_index, test/functorch/test_dims.py::TestMinFunctorchOnly::test_index_placement, test/functorch/test_dims.py::TestMinFunctorchOnly::test_inplace, test/functorch/test_dims.py::TestMinFunctorchOnly::test_manual_stuff, test/functorch/test_dims.py::TestMinFunctorchOnly::test_mask, test/functorch/test_dims.py::TestMinFunctorchOnly::test_max, test/functorch/test_dims.py::TestMinFunctorchOnly::test_mm, test/functorch/test_dims.py::TestMinFunctorchOnly::test_mm_fuse, test/functorch/test_dims.py::TestMinFunctorchOnly::test_monkey, test/functorch/test_dims.py::TestMinFunctorchOnly::test_network, test/functorch/test_dims.py::TestMinFunctorchOnly::test_order, test/functorch/test_dims.py::TestMinFunctorchOnly::test_order_keyword, test/functorch/test_dims.py::TestMinFunctorchOnly::test_parse, test/functorch/test_dims.py::TestMinFunctorchOnly::test_permute_orig, test/functorch/test_dims.py::TestMinFunctorchOnly::test_seg, test/functorch/test_dims.py::TestMinFunctorchOnly::test_simple, test/functorch/test_dims.py::TestMinFunctorchOnly::test_softmax_split, test/functorch/test_dims.py::TestMinFunctorchOnly::test_stack, test/functorch/test_dims.py::TestMinFunctorchOnly::test_time_mm_fuse, test/functorch/test_dims.py::TestMinFunctorchOnly::test_with_dims_split 2025-08-14T22:47:19.4575705Z 2025-08-14T22:47:22.5776602Z 2025-08-14T22:47:22.5777892Z test_itt 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_itt_1.1_fd531bec2e9d2c9e_.log 2025-08-14T22:47:22.5779148Z Running 1 items in this shard: test/test_itt.py::TestItt::test_itt 2025-08-14T22:47:22.5779649Z 2025-08-14T22:47:23.5813575Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:23.5816291Z import pkg_resources 2025-08-14T22:47:23.9613913Z Running functorch/test_aotdispatch 1/1 ... [2025-08-14 22:47:23.960848] 2025-08-14T22:47:23.9614670Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:23.9616968Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aotdispatch.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:23.961261] 2025-08-14T22:47:26.6868476Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:26.6869799Z import pkg_resources 2025-08-14T22:47:27.0669843Z Running test_schema_check 1/1 ... [2025-08-14 22:47:27.066463] 2025-08-14T22:47:27.0670380Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:27.0672625Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_schema_check.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:27.066855] 2025-08-14T22:47:31.4408078Z 2025-08-14T22:47:31.4409274Z functorch/test_aotdispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aotdispatch_1.1_abfc07815adbe635_.log 2025-08-14T22:47:31.4611104Z Running 527 items in this shard: test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_off, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_outputs_dynamic_use_autograd_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_module, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_view_detach, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_ban_dropout_mut_pre_dispatch, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_forward_mutation_multiple_mut, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_forward_mutation_no_buffer_mut, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_functionalized_rng_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_dupes_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation_on_input_requiring_grad_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation_on_parameter_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_metadata_mutation_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_module_joint, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_multiple_outputs_require_grad_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_buffer_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_composite_implicit_inplace, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_composite_implicit_linear, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_contiguous, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_conv_and_bn, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_composite_implicit, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_simple, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_view, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_map_1, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_map_2, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_outdtype, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_reshape, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_autograd_op, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_cond, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_cond_nested, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_simplified_basic, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_simplified_pytrees_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_synthetic_bases_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_unbacked_arg, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_with_torch_cond, test/functorch/test_aotdispatch.py::TestPartitioning::test_autocast, test/functorch/test_aotdispatch.py::TestPartitioning::test_contiguous, test/functorch/test_aotdispatch.py::TestPartitioning::test_default_partitioner_getitem, test/functorch/test_aotdispatch.py::TestPartitioning::test_default_partitioner_output_tensor_shape_tensor, test/functorch/test_aotdispatch.py::TestPartitioning::test_generate_gives_inference_graph, test/functorch/test_aotdispatch.py::TestPartitioning::test_meta_tensor_inplace_op, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_output_tensor_shape_tensor, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_raise_getitems, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_save_shape, test/functorch/test_aotdispatch.py::TestPartitioning::test_preserve_random, test/functorch/test_aotdispatch.py::TestPartitioning::test_recompute_partitioning, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_incorrect_backward, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_inference, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_mutation_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_alias, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_requires_grad_in_no_grad, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_requires_grad_in_no_grad_views, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_simple, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_dynamic, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_fake_tensor_gm_raises, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_preserves_stack_trace, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_preserves_stack_trace_from_mutation, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_test_subclasses_with_tensor_factories, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_flex_attn_noncontiguous_tangents, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_dense, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_nested_subclass, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_nested_tensor_tangent, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_subclass, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_inductor_freezing_with_subclasses, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_inference_python_dispatcher, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_lift_fresh_copy_in_graph, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_False_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_False_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_True_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_True_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_False_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_False_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_True_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_True_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_rrelu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_rrelu_with_noise_mutation, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_all, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_donated, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_no_static, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_donated_buffers, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_params, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_recompile, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_subclass_parameters, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_subclass_parameters_torture_case, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_tangent_type_coercion, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_wrong_guess_tangent_type, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_off, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inputs_overlapping_unsqueeze_with_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inputs_overlapping_with_mutation_guard_base, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_outputs_dynamic_use_autograd_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_module, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutation_of_input_in_fw_and_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutations_in_bw_detached_from_tangent, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_view_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_off, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_alias_everything, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inputs_overlapping_unsqueeze_with_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inputs_overlapping_with_mutation_guard_base, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_outputs_dynamic_use_autograd_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_module, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutation_of_input_in_fw_and_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutations_in_bw_detached_from_tangent, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_view_detach 2025-08-14T22:47:31.4805779Z 2025-08-14T22:47:35.4803286Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:35.4804688Z import pkg_resources 2025-08-14T22:47:35.8568363Z Running test_datapipe 1/1 ... [2025-08-14 22:47:35.856318] 2025-08-14T22:47:35.8568887Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:35.8571785Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_datapipe.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:35.856749] 2025-08-14T22:47:40.7808992Z 2025-08-14T22:47:40.7809986Z test_datapipe 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_datapipe_1.1_8a41c5819fb055b3_.log 2025-08-14T22:47:40.7850573Z Running 94 items in this shard: test/test_datapipe.py::TestDataChunk::test_as_string, test/test_datapipe.py::TestDataChunk::test_getitem, test/test_datapipe.py::TestDataChunk::test_iter, test/test_datapipe.py::TestDataChunk::test_len, test/test_datapipe.py::TestDataChunk::test_random_shuffle, test/test_datapipe.py::TestDataChunk::test_reverse, test/test_datapipe.py::TestDataChunk::test_sort, test/test_datapipe.py::TestStreamWrapper::test_api, test/test_datapipe.py::TestStreamWrapper::test_dir, test/test_datapipe.py::TestStreamWrapper::test_pickle, test/test_datapipe.py::TestStreamWrapper::test_repr, test/test_datapipe.py::TestIterableDataPipeBasic::test_demux_mux_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_groupby_iterable_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_listdirfiles_iterable_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_listdirfilesdeterministic_iterable_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_map_with_col_file_handle_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_openfilesfromdisk_iterable_datapipe, test/test_datapipe.py::TestIterableDataPipeBasic::test_routeddecoder_iterable_datapipe, test/test_datapipe.py::TestCaptureDataFrame::test_basic_capture, test/test_datapipe.py::TestDataFramesPipes::test_batch, test/test_datapipe.py::TestDataFramesPipes::test_capture, test/test_datapipe.py::TestDataFramesPipes::test_collate, test/test_datapipe.py::TestDataFramesPipes::test_filter, test/test_datapipe.py::TestDataFramesPipes::test_shuffle, test/test_datapipe.py::TestDataFramesPipes::test_unbatch, test/test_datapipe.py::TestFunctionalIterDataPipe::test_batch_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_collate_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_concat_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_demux_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_docstring, test/test_datapipe.py::TestFunctionalIterDataPipe::test_filter_datapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_fork_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_iterable_wrapper_datapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_map_dict_with_col_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_map_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_map_tuple_list_with_col_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_mux_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_sampler_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_serializable, test/test_datapipe.py::TestFunctionalIterDataPipe::test_serializable_with_dill, test/test_datapipe.py::TestFunctionalIterDataPipe::test_shuffler_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_stream_reader_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_unbatch_iterdatapipe, test/test_datapipe.py::TestFunctionalIterDataPipe::test_zip_iterdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_batch_mapdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_concat_mapdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_docstring, test/test_datapipe.py::TestFunctionalMapDataPipe::test_map_mapdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_sequence_wrapper_datapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_serializable, test/test_datapipe.py::TestFunctionalMapDataPipe::test_serializable_with_dill, test/test_datapipe.py::TestFunctionalMapDataPipe::test_shuffler_mapdatapipe, test/test_datapipe.py::TestFunctionalMapDataPipe::test_zip_mapdatapipe, test/test_datapipe.py::TestTyping::test_compile_time, test/test_datapipe.py::TestTyping::test_construct_time, test/test_datapipe.py::TestTyping::test_isinstance, test/test_datapipe.py::TestTyping::test_issubinstance, test/test_datapipe.py::TestTyping::test_protocol, test/test_datapipe.py::TestTyping::test_reinforce, test/test_datapipe.py::TestTyping::test_runtime, test/test_datapipe.py::TestTyping::test_subtype, test/test_datapipe.py::TestGraph::test_simple_traverse, test/test_datapipe.py::TestGraph::test_traverse_circular_datapipe, test/test_datapipe.py::TestGraph::test_traverse_forked, test/test_datapipe.py::TestGraph::test_traverse_mapdatapipe, test/test_datapipe.py::TestGraph::test_traverse_mixdatapipe, test/test_datapipe.py::TestGraph::test_traverse_unhashable_datapipe, test/test_datapipe.py::TestSerialization::test_spawn_lambdas_iter, test/test_datapipe.py::TestSerialization::test_spawn_lambdas_map, test/test_datapipe.py::TestCircularSerialization::test_circular_serialization_with_dill, test/test_datapipe.py::TestCircularSerialization::test_circular_serialization_with_pickle, test/test_datapipe.py::TestSharding::test_legacy_custom_sharding, test/test_datapipe.py::TestSharding::test_legacy_custom_sharding_with_old_dataloader, test/test_datapipe.py::TestSharding::test_multi_sharding, test/test_datapipe.py::TestSharding::test_old_dataloader, test/test_datapipe.py::TestSharding::test_sharding_groups, test/test_datapipe.py::TestSharding::test_sharding_groups_in_legacy_grouping_package, test/test_datapipe.py::TestSharding::test_sharding_length, test/test_datapipe.py::TestSharding::test_simple_sharding, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_buggy, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_constraint_multiple_outputs, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_generator, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_new_object, test/test_datapipe.py::TestIterDataPipeSingletonConstraint::test_iterdatapipe_singleton_self_next, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_generator_function, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_generator_function_exception, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_next, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_next_exception, test/test_datapipe.py::TestIterDataPipeCountSampleYielded::test_iterdatapipe_sample_yielded_return_self, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_custom_non_generator, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_custom_self_next, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_graph, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_graph_repeated, test/test_datapipe.py::TestIterDataPipeGraphFastForward::test_simple_snapshot_graph_with_serialization 2025-08-14T22:47:40.7889989Z 2025-08-14T22:47:44.8692406Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:44.8693727Z import pkg_resources 2025-08-14T22:47:45.2467263Z Running test_bundled_inputs 1/1 ... [2025-08-14 22:47:45.246177] 2025-08-14T22:47:45.2467681Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:45.2469302Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_bundled_inputs.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:45.246584] 2025-08-14T22:47:45.7687834Z 2025-08-14T22:47:45.7688899Z test_schema_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_schema_check_1.1_704d08f6f136a409_.log 2025-08-14T22:47:46.0441795Z Running 5988 items in this shard: test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_custom_ops_output_is_input, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_custom_ops_secretly_aliasing, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_custom_ops_secretly_mutating, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_multiple_operators, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_multiple_operators_centered, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_outputs_unexpectedly_aliasing, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_simple, test/test_schema_check.py::TestSchemaCheck::test_is_alias_of_basic, test/test_schema_check.py::TestSchemaCheck::test_is_alias_of_empty_container, test/test_schema_check.py::TestSchemaCheck::test_mutation_check_fail, test/test_schema_check.py::TestSchemaCheck::test_mutation_check_fail_multiple_operators, test/test_schema_check.py::TestSchemaCheck::test_overlaps_basic, test/test_schema_check.py::TestSchemaCheck::test_overlaps_empty_container, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_empty_list_input, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_aliasing_inputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_default_replaced, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_device_input, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_kwarg_tensor, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_list_input, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_mutable_inputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_nested_training_op, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_training_op, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_wildcard_after, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_with_multiple_outputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_with_multiple_outputs_aliasing, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_aliasing_inputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_aliasing_outputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_as_strided, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_multiple_outputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_mutation, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_none, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_resize_, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_operator_order, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_operator_order_without_grad, test/test_schema_check.py::TestSchemaCheck::test_schema_info_bind_basic, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__batch_norm_with_update_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__batch_norm_with_update_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__batch_norm_with_update_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__batch_norm_with_update_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__native_batch_norm_legit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__native_batch_norm_legit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__native_batch_norm_legit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__native_batch_norm_legit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_lengths_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_lengths_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_lengths_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_lengths_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_offsets_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_offsets_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_offsets_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_offsets_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__softmax_backward_data_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__softmax_backward_data_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__softmax_backward_data_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__softmax_backward_data_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__upsample_bilinear2d_aa_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__upsample_bilinear2d_aa_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__upsample_bilinear2d_aa_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__upsample_bilinear2d_aa_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bernoulli_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bernoulli_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bernoulli_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bernoulli_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_shapes_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cauchy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cauchy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cauchy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cauchy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdist_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdist_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_inverse_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_inverse_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_inverse_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_inverse_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_complex_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_complex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_complex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exponential_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exponential_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exponential_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exponential_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float8_e4m3fn, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float8_e4m3fnuz, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float8_e5m2, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float8_e5m2fnuz, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frac_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frac_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frac_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frac_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_uint16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_uint32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geqrf_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geqrf_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geqrf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geqrf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hypot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hypot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hypot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hypot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_igamma_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_igamma_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_igammac_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_igammac_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_imag_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_imag_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_imag_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_istft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_istft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cond_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cond_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cond_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cond_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_det_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_det_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_det_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_det_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eig_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eig_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eig_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eig_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvals_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvals_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvals_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvals_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvalsh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvalsh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvalsh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvalsh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_householder_product_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_householder_product_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_householder_product_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_householder_product_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_grad_oriented_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_grad_oriented_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_grad_oriented_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_grad_oriented_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_power_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_power_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_power_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_power_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_hermitian_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_hermitian_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_hermitian_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_hermitian_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_hermitian_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_hermitian_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_hermitian_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_hermitian_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_singular_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_singular_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_singular_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_singular_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_qr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_qr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_qr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_qr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_slogdet_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_slogdet_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_slogdet_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_slogdet_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_triangular_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_triangular_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_triangular_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_triangular_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svd_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svd_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svd_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svd_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svdvals_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svdvals_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svdvals_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svdvals_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorinv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorinv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorinv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorinv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorsolve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorsolve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorsolve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorsolve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_normal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_normal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_normal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_normal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logdet_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logdet_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logdet_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logdet_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_unpack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_unpack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_unpack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_unpack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_log_softmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_log_softmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_log_softmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_log_softmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logaddexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logaddexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logaddexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logaddexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_median_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_median_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_median_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_median_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_pool2d_with_indices_backward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_pool2d_with_indices_backward_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_pool2d_with_indices_backward_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_multinomial_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_multinomial_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_multinomial_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_multinomial_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanquantile_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanquantile_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_batch_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_batch_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_batch_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_batch_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_dropout_backward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_dropout_backward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_dropout_backward_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_dropout_backward_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_layer_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_layer_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_layer_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_layer_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nextafter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nextafter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nextafter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nextafter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_alpha_dropout_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_alpha_dropout_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_alpha_dropout_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_alpha_dropout_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_bilinear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_bilinear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_bilinear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_bilinear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_celu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_celu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_celu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_celu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_similarity_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_similarity_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_similarity_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_similarity_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cross_entropy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cross_entropy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cross_entropy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cross_entropy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_ctc_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_ctc_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_elu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_elu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_elu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_elu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_bag_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_bag_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_bag_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_bag_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gaussian_nll_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gaussian_nll_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gaussian_nll_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gelu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gelu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gelu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gelu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_glu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_glu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_glu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_glu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_grid_sample_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_grid_sample_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_grid_sample_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_grid_sample_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_group_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_group_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_group_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_group_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardshrink_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardshrink_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardshrink_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardshrink_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardsigmoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardsigmoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardsigmoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardsigmoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardswish_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardswish_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardswish_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardswish_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hinge_embedding_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hinge_embedding_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hinge_embedding_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_huber_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_huber_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_huber_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_huber_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_instance_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_instance_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_instance_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_instance_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_area_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_area_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_area_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_area_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bicubic_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bicubic_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bicubic_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bilinear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bilinear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bilinear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_linear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_linear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_linear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_linear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_trilinear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_trilinear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_trilinear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_kl_div_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_kl_div_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_kl_div_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_kl_div_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_layer_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_layer_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_layer_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_layer_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_leaky_relu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_leaky_relu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_leaky_relu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_leaky_relu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_local_response_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_local_response_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_local_response_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_local_response_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_logsigmoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_logsigmoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_logsigmoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_logsigmoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_grad_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_grad_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_grad_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_grad_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_grad_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_grad_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_grad_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_grad_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_grad_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mish_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mish_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mish_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mish_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mse_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mse_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mse_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mse_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_head_attention_forward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_head_attention_forward_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_head_attention_forward_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_margin_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_margin_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_nll_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_nll_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_nll_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_nll_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_one_hot_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pdist_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pdist_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_prelu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_prelu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_prelu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_prelu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rrelu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rrelu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rrelu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rrelu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_selu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_selu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_selu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_selu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_complex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_complex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_smooth_l1_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_smooth_l1_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_smooth_l1_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_soft_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_soft_margin_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_soft_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softplus_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softplus_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softplus_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softplus_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softshrink_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softshrink_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softshrink_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softshrink_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_bilinear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_bilinear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_bilinear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_nuc_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_nuc_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_nuc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_nuc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_number_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_number_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_number_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_number_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ormqr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ormqr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ormqr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ormqr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pca_lowrank_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pca_lowrank_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pca_lowrank_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pca_lowrank_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pinverse_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pinverse_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pinverse_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pinverse_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polar_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polar_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_qr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_qr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_qr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_qr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_quantile_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_quantile_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_0_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_0_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_3_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_3_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_3_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_3_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_neg_3_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_neg_3_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_neg_3_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_neg_3_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_bartlett_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_bartlett_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_blackman_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_blackman_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_cosine_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_cosine_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_exponential_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_exponential_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_gaussian_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_gaussian_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_general_cosine_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_general_cosine_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_general_hamming_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_general_hamming_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_hamming_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_hamming_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_hann_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_hann_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_kaiser_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_kaiser_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_nuttall_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_nuttall_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_mm_reduce_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_mm_reduce_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_mm_reduce_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_mm_reduce_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_sampled_addmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_sampled_addmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_sampled_addmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_sampled_addmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_lowrank_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_lowrank_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_lowrank_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_lowrank_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch__scaled_mm_cuda_float8_e4m3fn, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triangular_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triangular_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triangular_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triangular_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_indices_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_indices_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_indices_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_indices_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_uint16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_uint32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_uint64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_complex_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_complex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_complex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_real_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_real_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_uint8 2025-08-14T22:47:46.2836718Z 2025-08-14T22:47:49.9108094Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:49.9109407Z import pkg_resources 2025-08-14T22:47:49.9703861Z 2025-08-14T22:47:49.9704567Z test_bundled_inputs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_bundled_inputs_1.1_89ad33cc94e1b2c2_.log 2025-08-14T22:47:49.9708758Z Running 12 items in this shard: test/test_bundled_inputs.py::TestBundledInputs::test_bad_inputs, test/test_bundled_inputs.py::TestBundledInputs::test_dict_args, test/test_bundled_inputs.py::TestBundledInputs::test_double_augment_fail, test/test_bundled_inputs.py::TestBundledInputs::test_double_augment_non_mutator, test/test_bundled_inputs.py::TestBundledInputs::test_double_augment_success, test/test_bundled_inputs.py::TestBundledInputs::test_large_tensor_with_inflation, test/test_bundled_inputs.py::TestBundledInputs::test_multiple_methods_with_inputs, test/test_bundled_inputs.py::TestBundledInputs::test_multiple_methods_with_inputs_both_defined_failure, test/test_bundled_inputs.py::TestBundledInputs::test_multiple_methods_with_inputs_neither_defined_failure, test/test_bundled_inputs.py::TestBundledInputs::test_non_tensors, test/test_bundled_inputs.py::TestBundledInputs::test_rejected_tensors, test/test_bundled_inputs.py::TestBundledInputs::test_single_tensors 2025-08-14T22:47:49.9712281Z 2025-08-14T22:47:50.2936922Z Running profiler/test_execution_trace 1/1 ... [2025-08-14 22:47:50.293189] 2025-08-14T22:47:50.2937393Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:50.2939953Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_execution_trace.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:50.293635] 2025-08-14T22:47:54.0424261Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:54.0425568Z import pkg_resources 2025-08-14T22:47:54.4178314Z Running lazy/test_generator 1/1 ... [2025-08-14 22:47:54.417210] 2025-08-14T22:47:54.4178749Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:54.4179958Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_generator.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:54.417585] 2025-08-14T22:47:55.0673900Z 2025-08-14T22:47:55.0675190Z profiler/test_execution_trace 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_execution_trace_1.1_a62eea27030c7046_.log 2025-08-14T22:47:55.0680893Z Running 13 items in this shard: test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_alone_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_env_disabled_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_env_enabled_with_kineto_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_env_enabled_with_pt2_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_nested_tensor_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_no_capture_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_record_integral_tensor_data_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_record_integral_tensor_range_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_repeat_in_loop_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_start_stop_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_with_kineto_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_execution_trace_with_pt2_cuda, test/profiler/test_execution_trace.py::TestExecutionTraceCUDA::test_triton_fx_graph_with_et_cuda 2025-08-14T22:47:55.0686329Z 2025-08-14T22:47:59.0910080Z 2025-08-14T22:47:59.0911081Z lazy/test_generator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_generator_1.1_84760bf3b446f478_.log 2025-08-14T22:47:59.0912292Z Running 2 items in this shard: test/lazy/test_generator.py::LazyGeneratorTest::test_generator, test/lazy/test_generator.py::LazyGeneratorTest::test_generator_causes_multiple_compiles 2025-08-14T22:47:59.0913002Z 2025-08-14T22:47:59.2190145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:47:59.2191866Z import pkg_resources 2025-08-14T22:47:59.5980808Z Running torch_np/numpy_tests/core/test_indexing 1/1 ... [2025-08-14 22:47:59.597535] 2025-08-14T22:47:59.5981353Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:47:59.5982524Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_indexing.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:47:59.597916] 2025-08-14T22:48:03.2125361Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:48:03.2126716Z import pkg_resources 2025-08-14T22:48:03.5916252Z Running test_maskedtensor 1/1 ... [2025-08-14 22:48:03.591125] 2025-08-14T22:48:03.5916649Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:48:03.5919151Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_maskedtensor.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:48:03.591504] 2025-08-14T22:48:04.4715720Z 2025-08-14T22:48:04.4717835Z torch_np/numpy_tests/core/test_indexing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_indexing_1.1_1ede8d91f764b463_.log 2025-08-14T22:48:04.4742517Z Running 67 items in this shard: test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_boolean_assignment_value_mismatch, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_boolean_indexing_list, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_boolean_indexing_onedim, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_boolean_indexing_twodim, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_boolean_shape_mismatch, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_broaderrors_indexing, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_broken_sequence_not_nd_index, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_ellipsis_index, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_ellipsis_index_2, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_empty_fancy_index, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_empty_tuple_index, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_everything_returns_views, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_index_no_array_to_index, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_index_no_floats, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_indexing_array_negative_strides, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_indexing_array_weird_strides, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_memory_order, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_none_index, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_nontuple_ndindex, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_reverse_strides_and_subspace_bufferinit, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_reversed_strides_result_allocation, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_same_kind_index_casting, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_scalar_array_bool, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_single_bool_index, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_single_int_index, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_slicing_no_floats, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_small_regressions, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index2_num_32_original_ndim_1, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index2_num_32_original_ndim_32, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index2_num_40_original_ndim_1, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index2_num_40_original_ndim_32, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index_False_num_32_original_ndim_1, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index_False_num_32_original_ndim_32, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index_False_num_40_original_ndim_1, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index_False_num_40_original_ndim_32, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index_True_num_32_original_ndim_1, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index_True_num_32_original_ndim_32, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index_True_num_40_original_ndim_1, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_advanced_indices_index_True_num_40_original_ndim_32, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_too_many_fancy_indices_special_case, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_trivial_fancy_not_possible, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_trivial_fancy_out_of_bounds, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_tuple_subclass, test/torch_np/numpy_tests/core/test_indexing.py::TestIndexing::test_uncontiguous_subspace_assignment, test/torch_np/numpy_tests/core/test_indexing.py::TestBroadcastedAssignments::test_broadcast_error_reports_correct_shape_index0, test/torch_np/numpy_tests/core/test_indexing.py::TestBroadcastedAssignments::test_broadcast_error_reports_correct_shape_index1, test/torch_np/numpy_tests/core/test_indexing.py::TestBroadcastedAssignments::test_broadcast_error_reports_correct_shape_index2, test/torch_np/numpy_tests/core/test_indexing.py::TestBroadcastedAssignments::test_broadcast_subspace, test/torch_np/numpy_tests/core/test_indexing.py::TestBroadcastedAssignments::test_index_is_larger, test/torch_np/numpy_tests/core/test_indexing.py::TestBroadcastedAssignments::test_prepend_not_one, test/torch_np/numpy_tests/core/test_indexing.py::TestBroadcastedAssignments::test_prepending_ones, test/torch_np/numpy_tests/core/test_indexing.py::TestBroadcastedAssignments::test_simple_broadcasting_errors, test/torch_np/numpy_tests/core/test_indexing.py::TestFancyIndexingCast::test_boolean_index_cast_assign, test/torch_np/numpy_tests/core/test_indexing.py::TestMultiIndexingAutomated::test_1d, test/torch_np/numpy_tests/core/test_indexing.py::TestMultiIndexingAutomated::test_boolean, test/torch_np/numpy_tests/core/test_indexing.py::TestMultiIndexingAutomated::test_multidim, test/torch_np/numpy_tests/core/test_indexing.py::TestFloatNonIntegerArgument::test_non_integer_argument_errors, test/torch_np/numpy_tests/core/test_indexing.py::TestFloatNonIntegerArgument::test_non_integer_sequence_multiplication, test/torch_np/numpy_tests/core/test_indexing.py::TestFloatNonIntegerArgument::test_reduce_axis_float_index, test/torch_np/numpy_tests/core/test_indexing.py::TestFloatNonIntegerArgument::test_valid_indexing, test/torch_np/numpy_tests/core/test_indexing.py::TestFloatNonIntegerArgument::test_valid_slicing, test/torch_np/numpy_tests/core/test_indexing.py::TestBooleanIndexing::test_bool_as_int_argument_errors, test/torch_np/numpy_tests/core/test_indexing.py::TestBooleanIndexing::test_boolean_indexing_fast_path, test/torch_np/numpy_tests/core/test_indexing.py::TestBooleanIndexing::test_boolean_indexing_weirdness, test/torch_np/numpy_tests/core/test_indexing.py::TestArrayToIndexDeprecation::test_array_to_index_error, test/torch_np/numpy_tests/core/test_indexing.py::TestNonIntegerArrayLike::test_basic, test/torch_np/numpy_tests/core/test_indexing.py::TestMultipleEllipsisError::test_basic 2025-08-14T22:48:04.4768090Z 2025-08-14T22:48:08.5986444Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:48:08.5987777Z import pkg_resources 2025-08-14T22:48:08.9860506Z Running test_tensorboard 1/1 ... [2025-08-14 22:48:08.985634] 2025-08-14T22:48:08.9860956Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:48:08.9863621Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_tensorboard.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:48:08.986025] 2025-08-14T22:48:11.0711343Z 2025-08-14T22:48:11.0712322Z test_maskedtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_maskedtensor_1.1_c46245c59626dad0_.log 2025-08-14T22:48:11.1000611Z Running 958 items in this shard: test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn0, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn1, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn10, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn11, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn12, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn13, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn14, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn15, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn16, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn17, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn18, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn19, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn2, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn20, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn21, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn22, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn23, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn24, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn25, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn26, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn27, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn28, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn29, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn3, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn30, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn31, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn32, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn33, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn34, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn35, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn36, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn37, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn38, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn39, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn4, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn40, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn41, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn42, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn43, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn44, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn45, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn46, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn47, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn48, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn49, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn5, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn50, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn51, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn52, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn53, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn54, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn55, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn56, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn57, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn6, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn7, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn8, test/test_maskedtensor.py::TestUnary::test_inplace_unary_fn9, test/test_maskedtensor.py::TestUnary::test_unary_fn0, test/test_maskedtensor.py::TestUnary::test_unary_fn1, test/test_maskedtensor.py::TestUnary::test_unary_fn10, test/test_maskedtensor.py::TestUnary::test_unary_fn11, test/test_maskedtensor.py::TestUnary::test_unary_fn12, test/test_maskedtensor.py::TestUnary::test_unary_fn13, test/test_maskedtensor.py::TestUnary::test_unary_fn14, test/test_maskedtensor.py::TestUnary::test_unary_fn15, test/test_maskedtensor.py::TestUnary::test_unary_fn16, test/test_maskedtensor.py::TestUnary::test_unary_fn17, test/test_maskedtensor.py::TestUnary::test_unary_fn18, test/test_maskedtensor.py::TestUnary::test_unary_fn19, test/test_maskedtensor.py::TestUnary::test_unary_fn2, test/test_maskedtensor.py::TestUnary::test_unary_fn20, test/test_maskedtensor.py::TestUnary::test_unary_fn21, test/test_maskedtensor.py::TestUnary::test_unary_fn22, test/test_maskedtensor.py::TestUnary::test_unary_fn23, test/test_maskedtensor.py::TestUnary::test_unary_fn24, test/test_maskedtensor.py::TestUnary::test_unary_fn25, test/test_maskedtensor.py::TestUnary::test_unary_fn26, test/test_maskedtensor.py::TestUnary::test_unary_fn27, test/test_maskedtensor.py::TestUnary::test_unary_fn28, test/test_maskedtensor.py::TestUnary::test_unary_fn29, test/test_maskedtensor.py::TestUnary::test_unary_fn3, test/test_maskedtensor.py::TestUnary::test_unary_fn30, test/test_maskedtensor.py::TestUnary::test_unary_fn31, test/test_maskedtensor.py::TestUnary::test_unary_fn32, test/test_maskedtensor.py::TestUnary::test_unary_fn33, test/test_maskedtensor.py::TestUnary::test_unary_fn34, test/test_maskedtensor.py::TestUnary::test_unary_fn35, test/test_maskedtensor.py::TestUnary::test_unary_fn36, test/test_maskedtensor.py::TestUnary::test_unary_fn37, test/test_maskedtensor.py::TestUnary::test_unary_fn38, test/test_maskedtensor.py::TestUnary::test_unary_fn39, test/test_maskedtensor.py::TestUnary::test_unary_fn4, test/test_maskedtensor.py::TestUnary::test_unary_fn40, test/test_maskedtensor.py::TestUnary::test_unary_fn41, test/test_maskedtensor.py::TestUnary::test_unary_fn42, test/test_maskedtensor.py::TestUnary::test_unary_fn43, test/test_maskedtensor.py::TestUnary::test_unary_fn44, test/test_maskedtensor.py::TestUnary::test_unary_fn45, test/test_maskedtensor.py::TestUnary::test_unary_fn46, test/test_maskedtensor.py::TestUnary::test_unary_fn47, test/test_maskedtensor.py::TestUnary::test_unary_fn48, test/test_maskedtensor.py::TestUnary::test_unary_fn49, test/test_maskedtensor.py::TestUnary::test_unary_fn5, test/test_maskedtensor.py::TestUnary::test_unary_fn50, test/test_maskedtensor.py::TestUnary::test_unary_fn51, test/test_maskedtensor.py::TestUnary::test_unary_fn52, test/test_maskedtensor.py::TestUnary::test_unary_fn53, test/test_maskedtensor.py::TestUnary::test_unary_fn54, test/test_maskedtensor.py::TestUnary::test_unary_fn55, test/test_maskedtensor.py::TestUnary::test_unary_fn56, test/test_maskedtensor.py::TestUnary::test_unary_fn57, test/test_maskedtensor.py::TestUnary::test_unary_fn58, test/test_maskedtensor.py::TestUnary::test_unary_fn59, test/test_maskedtensor.py::TestUnary::test_unary_fn6, test/test_maskedtensor.py::TestUnary::test_unary_fn60, test/test_maskedtensor.py::TestUnary::test_unary_fn61, test/test_maskedtensor.py::TestUnary::test_unary_fn7, test/test_maskedtensor.py::TestUnary::test_unary_fn8, test/test_maskedtensor.py::TestUnary::test_unary_fn9, test/test_maskedtensor.py::TestBinary::test_binary_fn0, test/test_maskedtensor.py::TestBinary::test_binary_fn1, test/test_maskedtensor.py::TestBinary::test_binary_fn10, test/test_maskedtensor.py::TestBinary::test_binary_fn11, test/test_maskedtensor.py::TestBinary::test_binary_fn12, test/test_maskedtensor.py::TestBinary::test_binary_fn13, test/test_maskedtensor.py::TestBinary::test_binary_fn14, test/test_maskedtensor.py::TestBinary::test_binary_fn15, test/test_maskedtensor.py::TestBinary::test_binary_fn16, test/test_maskedtensor.py::TestBinary::test_binary_fn17, test/test_maskedtensor.py::TestBinary::test_binary_fn18, test/test_maskedtensor.py::TestBinary::test_binary_fn19, test/test_maskedtensor.py::TestBinary::test_binary_fn2, test/test_maskedtensor.py::TestBinary::test_binary_fn20, test/test_maskedtensor.py::TestBinary::test_binary_fn21, test/test_maskedtensor.py::TestBinary::test_binary_fn22, test/test_maskedtensor.py::TestBinary::test_binary_fn23, test/test_maskedtensor.py::TestBinary::test_binary_fn24, test/test_maskedtensor.py::TestBinary::test_binary_fn25, test/test_maskedtensor.py::TestBinary::test_binary_fn26, test/test_maskedtensor.py::TestBinary::test_binary_fn27, test/test_maskedtensor.py::TestBinary::test_binary_fn28, test/test_maskedtensor.py::TestBinary::test_binary_fn29, test/test_maskedtensor.py::TestBinary::test_binary_fn3, test/test_maskedtensor.py::TestBinary::test_binary_fn30, test/test_maskedtensor.py::TestBinary::test_binary_fn31, test/test_maskedtensor.py::TestBinary::test_binary_fn32, test/test_maskedtensor.py::TestBinary::test_binary_fn33, test/test_maskedtensor.py::TestBinary::test_binary_fn34, test/test_maskedtensor.py::TestBinary::test_binary_fn35, test/test_maskedtensor.py::TestBinary::test_binary_fn4, test/test_maskedtensor.py::TestBinary::test_binary_fn5, test/test_maskedtensor.py::TestBinary::test_binary_fn6, test/test_maskedtensor.py::TestBinary::test_binary_fn7, test/test_maskedtensor.py::TestBinary::test_binary_fn8, test/test_maskedtensor.py::TestBinary::test_binary_fn9, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn0, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn1, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn10, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn11, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn12, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn13, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn14, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn15, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn16, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn17, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn18, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn19, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn2, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn20, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn21, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn22, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn23, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn24, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn25, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn26, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn27, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn28, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn29, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn3, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn4, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn5, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn6, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn7, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn8, test/test_maskedtensor.py::TestBinary::test_inplace_binary_fn9, test/test_maskedtensor.py::TestBinary::test_masks_match_fn_name_add, test/test_maskedtensor.py::TestBinary::test_masks_match_fn_name_add_, test/test_maskedtensor.py::TestReductions::test__is_any_true, test/test_maskedtensor.py::TestReductions::test__is_any_true_false, test/test_maskedtensor.py::TestReductions::test_all, test/test_maskedtensor.py::TestReductions::test_amax, test/test_maskedtensor.py::TestReductions::test_amax_grad, test/test_maskedtensor.py::TestReductions::test_amin, test/test_maskedtensor.py::TestReductions::test_amin_grad, test/test_maskedtensor.py::TestReductions::test_any_true_dtype, test/test_maskedtensor.py::TestReductions::test_backward, test/test_maskedtensor.py::TestReductions::test_grad_dtype, test/test_maskedtensor.py::TestReductions::test_max_not_implemented, test/test_maskedtensor.py::TestReductions::test_mean, test/test_maskedtensor.py::TestReductions::test_mean_dim_grad, test/test_maskedtensor.py::TestReductions::test_mean_grad_case_1a, test/test_maskedtensor.py::TestReductions::test_mean_grad_case_1b, test/test_maskedtensor.py::TestReductions::test_mean_grad_case_1c, test/test_maskedtensor.py::TestReductions::test_mean_grad_case_1d, test/test_maskedtensor.py::TestReductions::test_mean_grad_case_1e, test/test_maskedtensor.py::TestReductions::test_mean_grad_case_1f, test/test_maskedtensor.py::TestReductions::test_prod, test/test_maskedtensor.py::TestReductions::test_prod_grad, test/test_maskedtensor.py::TestReductions::test_sum, test/test_maskedtensor.py::TestReductions::test_sum_grad, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_add_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_atan2_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_floor_rounding_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_no_rounding_mode_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_div_trunc_rounding_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_eq_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_floor_divide_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmax_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmin_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_fmod_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ge_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_gt_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_le_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_logaddexp_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_lt_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_maximum_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_minimum_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_mul_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_ne_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_nextafter_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_remainder_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_sub_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_binary_core_true_divide_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amax_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_amin_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmax_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_argmin_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_prod_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_reduction_all_sum_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_abs_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acos_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_acosh_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_angle_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_angle_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_angle_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_angle_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_angle_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_angle_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asin_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_asinh_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atan_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_atanh_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_ceil_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_conj_physical_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cos_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_cosh_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_deg2rad_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_digamma_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erf_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfc_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_erfinv_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp2_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_exp_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_expm1_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_floor_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_frac_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_i0_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_isnan_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_lgamma_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log10_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log1p_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log2_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_log_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_logit_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_nan_to_num_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_neg_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_positive_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rad2deg_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_reciprocal_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_0_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_3_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_decimals_neg_3_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_round_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_rsqrt_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sgn_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sigmoid_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sign_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_signbit_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sin_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinc_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sinh_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_sqrt_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_square_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tan_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_tanh_layout2_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout0_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout0_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout0_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout1_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout1_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout1_cuda_float64, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout2_cuda_float16, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout2_cuda_float32, test/test_maskedtensor.py::TestOperatorsCUDA::test_unary_core_trunc_layout2_cuda_float64, test/test_maskedtensor.py::TestBasicsCUDA::test_add_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_contiguous_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_diff_dim_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_diff_layouts_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_diff_sizes_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_grad_warning_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_invalid_sparse_coo_values_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_invalid_sparse_csr_values_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_invalid_sparse_layout_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_invalid_tensor_inputs_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_nn_unfold_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_softmax_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_stack_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_to_dense_and_sparse_coo_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_to_dense_and_sparse_csr_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_to_dense_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_to_device_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_to_dtype_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_to_sparse_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_unfold_cuda, test/test_maskedtensor.py::TestBasicsCUDA::test_where_cuda 2025-08-14T22:48:11.1276984Z 2025-08-14T22:48:15.2335553Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:48:15.2336884Z import pkg_resources 2025-08-14T22:48:15.6081951Z Running torch_np/numpy_tests/lib/test_type_check 1/1 ... [2025-08-14 22:48:15.607664] 2025-08-14T22:48:15.6082453Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:48:15.6084036Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_type_check.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:48:15.608045] 2025-08-14T22:48:20.5343643Z 2025-08-14T22:48:20.5345455Z torch_np/numpy_tests/lib/test_type_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_type_check_1.1_01c7801ce54ecfce_.log 2025-08-14T22:48:20.5370755Z Running 50 items in this shard: test/torch_np/numpy_tests/lib/test_type_check.py::TestCommonType::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestMintypecode::test_default_1, test/torch_np/numpy_tests/lib/test_type_check.py::TestMintypecode::test_default_2, test/torch_np/numpy_tests/lib/test_type_check.py::TestMintypecode::test_default_3, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsscalar::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestReal::test_cmplx, test/torch_np/numpy_tests/lib/test_type_check.py::TestReal::test_real, test/torch_np/numpy_tests/lib/test_type_check.py::TestImag::test_cmplx, test/torch_np/numpy_tests/lib/test_type_check.py::TestImag::test_real, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplex::test_fail, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplex::test_pass, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsreal::test_fail, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsreal::test_isreal_real, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsreal::test_pass, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplexobj::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplexobj::test_list, test/torch_np/numpy_tests/lib/test_type_check.py::TestIscomplexobj::test_scalar, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsrealobj::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_complex, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_complex1, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_goodvalues, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_ind, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_integer, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_neginf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsnan::test_posinf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_complex, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_complex1, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_goodvalues, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_ind, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_integer, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_neginf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsfinite::test_posinf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_goodvalues, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_ind, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_neginf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_neginf_scalar, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_posinf, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsinf::test_posinf_scalar, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsposinf::test_generic, test/torch_np/numpy_tests/lib/test_type_check.py::TestIsneginf::test_generic, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_array, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_complex_bad, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_complex_bad2, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_complex_good, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_do_not_rewrite_previous_keyword, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_float, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_generic, test/torch_np/numpy_tests/lib/test_type_check.py::TestNanToNum::test_integer, test/torch_np/numpy_tests/lib/test_type_check.py::TestRealIfClose::test_basic, test/torch_np/numpy_tests/lib/test_type_check.py::TestArrayConversion::test_asfarray 2025-08-14T22:48:20.5394163Z 2025-08-14T22:48:24.7795792Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:48:24.7797598Z import pkg_resources 2025-08-14T22:48:25.1561616Z Running lazy/test_ts_opinfo 1/1 ... [2025-08-14 22:48:25.155653] 2025-08-14T22:48:25.1562139Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:48:25.1563613Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_ts_opinfo.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:48:25.156026] 2025-08-14T22:48:31.1364525Z 2025-08-14T22:48:31.1365281Z lazy/test_ts_opinfo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_ts_opinfo_1.1_447e9088e9ec6a5f_.log 2025-08-14T22:48:31.1367692Z Running 5 items in this shard: test/lazy/test_ts_opinfo.py::TestLazyTensor::testConvolutionBackward, test/lazy/test_ts_opinfo.py::TestLazyTensor::test_tensor_ctr, test/lazy/test_ts_opinfo.py::TestLazyTensor::test_view_mark_step_preserved, test/lazy/test_ts_opinfo.py::TestLazyDynamicOps::test_adaptiveavgpool3d_dynamic, test/lazy/test_ts_opinfo.py::TestLazyDynamicOps::test_nonzero_dynamic 2025-08-14T22:48:31.1369130Z 2025-08-14T22:48:35.3729727Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:48:35.3732013Z import pkg_resources 2025-08-14T22:48:35.7526931Z Running test_mkldnn_fusion 1/1 ... [2025-08-14 22:48:35.752014] 2025-08-14T22:48:35.7527568Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:48:35.7529267Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_mkldnn_fusion.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:48:35.752424] 2025-08-14T22:48:36.9170681Z 2025-08-14T22:48:36.9171603Z test_tensorboard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_tensorboard_1.1_42d64527d9396035_.log 2025-08-14T22:48:36.9186058Z Running 50 items in this shard: test/test_tensorboard.py::TestTensorBoardPyTorchNumpy::test_pytorch_autograd_np, test/test_tensorboard.py::TestTensorBoardPyTorchNumpy::test_pytorch_histogram, test/test_tensorboard.py::TestTensorBoardPyTorchNumpy::test_pytorch_histogram_raw, test/test_tensorboard.py::TestTensorBoardPyTorchNumpy::test_pytorch_np, test/test_tensorboard.py::TestTensorBoardPyTorchNumpy::test_pytorch_write, test/test_tensorboard.py::TestTensorBoardUtils::test_convert_to_HWC_dtype_remains_same, test/test_tensorboard.py::TestTensorBoardUtils::test_numpy_vid_uint8, test/test_tensorboard.py::TestTensorBoardUtils::test_prepare_video, test/test_tensorboard.py::TestTensorBoardUtils::test_to_HWC, test/test_tensorboard.py::TestTensorBoardWriter::test_writer, test/test_tensorboard.py::TestTensorBoardSummaryWriter::test_pathlib, test/test_tensorboard.py::TestTensorBoardSummaryWriter::test_summary_writer_close, test/test_tensorboard.py::TestTensorBoardSummaryWriter::test_summary_writer_ctx, test/test_tensorboard.py::TestTensorBoardEmbedding::test_embedding, test/test_tensorboard.py::TestTensorBoardEmbedding::test_embedding_64, test/test_tensorboard.py::TestTensorBoardSummary::test_audio, test/test_tensorboard.py::TestTensorBoardSummary::test_custom_scalars, test/test_tensorboard.py::TestTensorBoardSummary::test_empty_input, test/test_tensorboard.py::TestTensorBoardSummary::test_float32_image, test/test_tensorboard.py::TestTensorBoardSummary::test_histogram_auto, test/test_tensorboard.py::TestTensorBoardSummary::test_histogram_doane, test/test_tensorboard.py::TestTensorBoardSummary::test_histogram_fd, test/test_tensorboard.py::TestTensorBoardSummary::test_image_with_3_channel_batched, test/test_tensorboard.py::TestTensorBoardSummary::test_image_with_boxes, test/test_tensorboard.py::TestTensorBoardSummary::test_image_with_one_channel, test/test_tensorboard.py::TestTensorBoardSummary::test_image_with_one_channel_batched, test/test_tensorboard.py::TestTensorBoardSummary::test_image_without_channel, test/test_tensorboard.py::TestTensorBoardSummary::test_list_input, test/test_tensorboard.py::TestTensorBoardSummary::test_mesh, test/test_tensorboard.py::TestTensorBoardSummary::test_scalar_new_style, test/test_tensorboard.py::TestTensorBoardSummary::test_text, test/test_tensorboard.py::TestTensorBoardSummary::test_uint8_image, test/test_tensorboard.py::TestTensorBoardSummary::test_video, test/test_tensorboard.py::TestTensorBoardPytorchGraph::test_mlp_graph, test/test_tensorboard.py::TestTensorBoardPytorchGraph::test_nested_nn_squential, test/test_tensorboard.py::TestTensorBoardPytorchGraph::test_pytorch_graph, test/test_tensorboard.py::TestTensorBoardPytorchGraph::test_pytorch_graph_dict_input, test/test_tensorboard.py::TestTensorBoardPytorchGraph::test_torchvision_smoke, test/test_tensorboard.py::TestTensorBoardPytorchGraph::test_wrong_input_size, test/test_tensorboard.py::TestTensorBoardFigure::test_figure, test/test_tensorboard.py::TestTensorBoardFigure::test_figure_list, test/test_tensorboard.py::TestTensorBoardNumpy::test_pytorch_np_expect_fail, test/test_tensorboard.py::TestTensorBoardNumpy::test_scalar, test/test_tensorboard.py::TestTensorProtoSummary::test_complex_tensor_proto, test/test_tensorboard.py::TestTensorProtoSummary::test_empty_tensor_proto, test/test_tensorboard.py::TestTensorProtoSummary::test_float_tensor_proto, test/test_tensorboard.py::TestTensorProtoSummary::test_half_tensor_proto_bfloat16_proto_type_14, test/test_tensorboard.py::TestTensorProtoSummary::test_half_tensor_proto_float16_proto_type_19, test/test_tensorboard.py::TestTensorProtoSummary::test_int_tensor_proto, test/test_tensorboard.py::TestTensorProtoSummary::test_scalar_tensor_proto 2025-08-14T22:48:36.9199847Z 2025-08-14T22:48:40.9721537Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:48:40.9722880Z import pkg_resources 2025-08-14T22:48:41.3470260Z Running benchmark_utils/test_benchmark_utils 1/1 ... [2025-08-14 22:48:41.346627] 2025-08-14T22:48:41.3470815Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:48:41.3474013Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'benchmark_utils/test_benchmark_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:48:41.347003] 2025-08-14T22:50:03.3367550Z 2025-08-14T22:50:03.3368553Z test_mkldnn_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_mkldnn_fusion_1.1_f97910fb5be574f5_.log 2025-08-14T22:50:03.3372828Z Running 8 items in this shard: test/test_mkldnn_fusion.py::TestMkldnnFusion::test_conv_binary_fusion_ops, test/test_mkldnn_fusion.py::TestMkldnnFusion::test_conv_transpose_unary_fusion_ops, test/test_mkldnn_fusion.py::TestMkldnnFusion::test_conv_unary_fusion_nnc, test/test_mkldnn_fusion.py::TestMkldnnFusion::test_conv_unary_fusion_ops, test/test_mkldnn_fusion.py::TestMkldnnFusion::test_linear_binary_fusion_ops, test/test_mkldnn_fusion.py::TestMkldnnFusion::test_linear_unary_fusion_ops, test/test_mkldnn_fusion.py::TestMkldnnFusion::test_single_conv, test/test_mkldnn_fusion.py::TestMkldnnFusion::test_unsupported_conv 2025-08-14T22:50:03.3374992Z 2025-08-14T22:50:07.4366336Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:50:07.4368793Z import pkg_resources 2025-08-14T22:50:07.8253843Z Running optim/test_swa_utils 1/1 ... [2025-08-14 22:50:07.824997] 2025-08-14T22:50:07.8254273Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:50:07.8257566Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'optim/test_swa_utils.py', '-m', 'not serial', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:50:07.825417] 2025-08-14T22:50:11.9485846Z 2025-08-14T22:50:11.9486995Z optim/test_swa_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/optim.test_swa_utils_1.1_0f26b337401dba43_.log 2025-08-14T22:50:11.9487836Z 2025-08-14T22:50:16.0097414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:50:16.0098792Z import pkg_resources 2025-08-14T22:50:16.3866049Z Running test_sparse_csr 2/2 ... [2025-08-14 22:50:16.386201] 2025-08-14T22:50:16.3866589Z SCRIBE_GRAPHQL_ACCESS_TOKEN is set 2025-08-14T22:50:16.3869333Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_csr.py', '-m', 'not serial', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-08-14 22:50:16.386590] 2025-08-14T22:54:18.3127311Z 2025-08-14T22:54:18.3128163Z benchmark_utils/test_benchmark_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/benchmark_utils.test_benchmark_utils_1.1_60a5ea44053c058e_.log 2025-08-14T22:54:18.3132336Z Running 9 items in this shard: test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_adaptive_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_collect_callgrind, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_collect_cpp_callgrind, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_compare, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_cpp_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_fuzzer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_manipulate_callgrind_stats, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_timer, test/benchmark_utils/test_benchmark_utils.py::TestBenchmarkUtils::test_timer_tiny_fast_snippet 2025-08-14T22:54:18.3135388Z 2025-08-14T22:54:22.3675630Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:54:22.3676985Z import pkg_resources 2025-08-14T22:58:11.8844789Z 2025-08-14T22:58:11.8845489Z test_sparse_csr 2/2 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_csr_2.2_34fd65573e1f2227_.log 2025-08-14T22:58:11.9745737Z Running 2445 items in this shard: test/test_sparse_csr.py::TestSparseCSRCUDA::test_add_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_add_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_errors_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_0_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_25_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_25_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_11x9_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_11x9_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_11x9_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_3x3_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_5x7_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_5x7_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_baddbmm_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_baddbmm_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_baddbmm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_True_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_bmm_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_bmm_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSC_SparseBSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSC_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSR_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSR_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSC_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSR_SparseBSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSR_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSR_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_to_csr_convert_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_storage_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_stride_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_to_block_csr_blocksize_2_cuda_float64_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_to_block_csr_blocksize_4_cuda_float64_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSC_NonBatched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSC_NonBatched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSR_Batched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSR_NonBatched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSR_NonBatched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSC_NonBatched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_matmul_device_mismatch_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mm_errors_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_autograd_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_autograd_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_errors_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_errors_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_zero_sized_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_zero_sized_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_zero_sized_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_addmm_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_to_sparse_compressed_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_to_sparse_compressed_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_triangular_solve_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_mean_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_mean_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_mean_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_mean_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_dim_SparseBSC_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_dim_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_dim_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_dim_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSC_target_sparse_compressed_tensor_no_size_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSR_target_sparse_compressed_tensor_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSR_target_sparse_compressed_tensor_no_size_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_layout_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_layout_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_pickle_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_pickle_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_pickle_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_print_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_print_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_16_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2x3_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_softmax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_softmax_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_softmax_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_16_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scatter_mm_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_unspecified_cuda_int8 2025-08-14T22:58:12.0615765Z 2025-08-14T22:58:13.6221039Z Uploading artifacts took 1.73 seconds 2025-08-14T22:58:15.9420379Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/hypothesis/entry_points.py:23: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. 2025-08-14T22:58:15.9421817Z import pkg_resources 2025-08-14T22:58:29.0769609Z 2025-08-14T22:58:29.0770654Z torch_np/numpy_tests/core/test_multiarray 1/2 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_multiarray_1.2_47a564619079a337_.log 2025-08-14T22:58:29.0923087Z Running 434 items in this shard: test/torch_np/numpy_tests/core/test_multiarray.py::TestFlag::test_readonly_flag_protocols_flag__warn_on_write_flag_value_True_writeable_False, test/torch_np/numpy_tests/core/test_multiarray.py::TestFlag::test_readonly_flag_protocols_flag_writeable_flag_value_False_writeable_False, test/torch_np/numpy_tests/core/test_multiarray.py::TestFlag::test_readonly_flag_protocols_flag_writeable_flag_value_True_writeable_True, test/torch_np/numpy_tests/core/test_multiarray.py::TestFlag::test_string_align, test/torch_np/numpy_tests/core/test_multiarray.py::TestFlag::test_warnonwrite, test/torch_np/numpy_tests/core/test_multiarray.py::TestFlag::test_writeable, test/torch_np/numpy_tests/core/test_multiarray.py::TestFlag::test_writeable_any_base, test/torch_np/numpy_tests/core/test_multiarray.py::TestFlag::test_writeable_from_buffer, test/torch_np/numpy_tests/core/test_multiarray.py::TestHash::test_int, test/torch_np/numpy_tests/core/test_multiarray.py::TestAttributes::test_attributes, test/torch_np/numpy_tests/core/test_multiarray.py::TestAttributes::test_fill, test/torch_np/numpy_tests/core/test_multiarray.py::TestAttributes::test_fill_readonly, test/torch_np/numpy_tests/core/test_multiarray.py::TestAttributes::test_set_stridesattr, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_array_as_keyword_array, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_array_as_keyword_asanyarray, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_array_as_keyword_ascontiguousarray, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_array_cont, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_array_copy_false, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_array_copy_false_2, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_array_copy_true, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_bad_arguments_error_asanyarray, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_bad_arguments_error_ascontiguousarray, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayConstruction::test_bad_arguments_error_asfortranarray, test/torch_np/numpy_tests/core/test_multiarray.py::TestAssignment::test_assignment_broadcasting, test/torch_np/numpy_tests/core/test_multiarray.py::TestAssignment::test_assignment_errors, test/torch_np/numpy_tests/core/test_multiarray.py::TestAssignment::test_cast_to_string, test/torch_np/numpy_tests/core/test_multiarray.py::TestAssignment::test_longdouble_assignment, test/torch_np/numpy_tests/core/test_multiarray.py::TestAssignment::test_stringlike_empty_list, test/torch_np/numpy_tests/core/test_multiarray.py::TestScalarIndexing::test_invalid_newaxis, test/torch_np/numpy_tests/core/test_multiarray.py::TestScalarIndexing::test_newaxis, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_array_too_big, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_empty_unicode, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_false_len_iterable, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_false_len_sequence, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_from_attribute, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_from_string, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_object_initialized_to_None_dtype0_function0, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_object_initialized_to_None_dtype_(2,3)O_function2, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_object_initialized_to_None_dtype_O,(3)O_function0, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_object_initialized_to_None_dtype_O,(3)O_function2, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_object_initialized_to_None_dtype_O,O_function0, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_object_initialized_to_None_dtype_O,O_function2, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_ragged_ndim_object, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_ragged_shape_object, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_sequence_non_homogeneous, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_structured_void_promotion_arr, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_structured_void_promotion_scalar, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_too_big_error, test/torch_np/numpy_tests/core/test_multiarray.py::TestCreation::test_zeros_big, test/torch_np/numpy_tests/core/test_multiarray.py::TestBool::test_cast_from_bytes, test/torch_np/numpy_tests/core/test_multiarray.py::TestBool::test_cast_from_void, test/torch_np/numpy_tests/core/test_multiarray.py::TestBool::test_count_nonzero_all, test/torch_np/numpy_tests/core/test_multiarray.py::TestBool::test_sum, test/torch_np/numpy_tests/core/test_multiarray.py::TestBool::test_test_interning, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_any_where, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_empty_array_kth_dtype_B, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_empty_array_kth_dtype_b, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_empty_array_kth_dtype_h, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_empty_array_kth_dtype_l, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_gh5524_kth_dtype_i, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_gh5524_kth_dtype_l, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_integer, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_out_of_range_dtype_D, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_out_of_range_dtype_d, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argpartition_out_of_range_dtype_i, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argsort_axis, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_argsort_complex, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_arr_mult_func0, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_arr_mult_func1, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_conjugate_out, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_copy, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_diagonal_view_notwriteable, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_dot, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_dot_out_mem_overlap, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_flatten, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_2_func0_dtype_D, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_2_func0_dtype_f, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_2_func0_dtype_i, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_2_func1_dtype_F, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_2_func1_dtype_d, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_2_func1_dtype_f, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_func0_dtype_D, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_func0_dtype_F, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_no_dgemv_func0_dtype_d, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_empty_array_kth_dtype_h, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_empty_array_kth_dtype_i, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_empty_array_kth_dtype_l, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_fuzz, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_integer, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_iterative, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_out_of_range_dtype_B, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_out_of_range_dtype_F, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_out_of_range_dtype_b, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_out_of_range_dtype_e, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_out_of_range_dtype_i, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_partition_out_of_range_dtype_l, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_prod, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_ravel, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_searchsorted_complex, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_searchsorted_floats_f32, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_complex_dtype0_part_imag, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_complex_dtype0_part_real, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_complex_dtype1_part_imag, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_complex_nans, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_degraded, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_signed_dtype0, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_signed_dtype2, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_signed_dtype3, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_signed_dtype4, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_signed_dtype6, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_unsigned_dtype0, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_unsigned_dtype1, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_sort_unsigned_dtype2, test/torch_np/numpy_tests/core/test_multiarray.py::TestMethods::test_squeeze, test/torch_np/numpy_tests/core/test_multiarray.py::TestFancyIndexing::test_assign_mask, test/torch_np/numpy_tests/core/test_multiarray.py::TestFancyIndexing::test_list, test/torch_np/numpy_tests/core/test_multiarray.py::TestFancyIndexing::test_mask, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size0_axis0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size11_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size12_axis_1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size13_axis13_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size16_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size16_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size17_axis_1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size19_axis_-3_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size19_axis_-3_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size1_axis_-1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size20_axis_-2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size21_axis_-1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size21_axis_-1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size22_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size23_axis_1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size23_axis_1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size24_axis_2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size25_axis25_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size25_axis25_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size26_axis_-3_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size26_axis_-3_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size27_axis_-2_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size28_axis_-1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size29_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size29_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size2_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size30_axis_1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size30_axis_1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size31_axis_2_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size34_axis_-3_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size35_axis_-2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size37_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size39_axis_2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size41_axis41_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size43_axis_-3_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size44_axis_-2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size44_axis_-2_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size45_axis_-1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size46_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size46_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size47_axis_1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size4_axis_-2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size50_axis50_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size52_axis_-3_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size53_axis_-2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size54_axis_-1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size55_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size55_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size56_axis_1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size56_axis_1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size58_axis_3_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size59_axis59_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size59_axis59_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size5_axis_-1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size60_axis_-4_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size61_axis_-3_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size62_axis_-2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size62_axis_-2_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size63_axis_-1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size64_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size65_axis_1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size67_axis_3_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size68_axis68_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size68_axis68_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size69_axis_-1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size6_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size70_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size70_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size71_axis71_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size73_axis_0_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size73_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size74_axis74_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size74_axis74_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size75_axis_-1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size76_axis_0_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size77_axis77_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size7_axis_1_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size7_axis_1_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size8_axis8_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size9_axis_-2_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size9_axis_-2_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_vs_ndarray_arr_method_argmax_np_method0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_np_vs_ndarray_arr_method_argmin_np_method1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_output_shape_method_argmax, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_0_method_argmax, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_0_method_argmin, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_1_method_argmax, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data0, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data13, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data15, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data16, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data21, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data25, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data26, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data29, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data3, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data30, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data34, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data39, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data42, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data44, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data45, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data46, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data52, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data56, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data58, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_combinations_data8, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmax::test_maximum_signed_integers, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data11, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data15, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data16, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data17, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data26, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data27, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data28, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data3, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data30, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data32, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data33, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data34, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data35, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data37, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data40, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data41, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data43, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data45, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data46, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data48, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data5, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data50, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data51, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data52, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data57, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data7, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data8, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_combinations_data9, test/torch_np/numpy_tests/core/test_multiarray.py::TestArgmin::test_minimum_signed_integers, test/torch_np/numpy_tests/core/test_multiarray.py::TestMinMax::test_axis, test/torch_np/numpy_tests/core/test_multiarray.py::TestNewaxis::test_basic, test/torch_np/numpy_tests/core/test_multiarray.py::TestClip::test_basic, test/torch_np/numpy_tests/core/test_multiarray.py::TestClip::test_nan, test/torch_np/numpy_tests/core/test_multiarray.py::TestCompress::test_axis, test/torch_np/numpy_tests/core/test_multiarray.py::TestCompress::test_flatten, test/torch_np/numpy_tests/core/test_multiarray.py::TestCompress::test_truncate, test/torch_np/numpy_tests/core/test_multiarray.py::TestPutmask::test_byteorder_greater_True, test/torch_np/numpy_tests/core/test_multiarray.py::TestPutmask::test_overlaps, test/torch_np/numpy_tests/core/test_multiarray.py::TestTake::test_clip, test/torch_np/numpy_tests/core/test_multiarray.py::TestTake::test_ip_types, test/torch_np/numpy_tests/core/test_multiarray.py::TestTake::test_out_overlap, test/torch_np/numpy_tests/core/test_multiarray.py::TestTake::test_ret_is_out_shape0, test/torch_np/numpy_tests/core/test_multiarray.py::TestTake::test_ret_is_out_shape1, test/torch_np/numpy_tests/core/test_multiarray.py::TestLexsort::test_basic_dtype0, test/torch_np/numpy_tests/core/test_multiarray.py::TestLexsort::test_basic_dtype2, test/torch_np/numpy_tests/core/test_multiarray.py::TestLexsort::test_basic_dtype3, test/torch_np/numpy_tests/core/test_multiarray.py::TestLexsort::test_basic_dtype4, test/torch_np/numpy_tests/core/test_multiarray.py::TestLexsort::test_basic_dtype5, test/torch_np/numpy_tests/core/test_multiarray.py::TestLexsort::test_datetime, test/torch_np/numpy_tests/core/test_multiarray.py::TestLexsort::test_mixed, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_ascii, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_binary, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_counted_string, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_dtype, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_dtype_bool, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_empty_files_text, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_file_position_after_fromfile, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_file_position_after_tofile, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_fromfile_bad_dup, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_fromfile_offset, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_inf, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_io_open_buffered_fromfile, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_largish_file, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_load_object_array_fromfile, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_malformed, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_nofile, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_numbers, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_read_shorter_than_count_subarray, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_roundtrip_binary_str, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_roundtrip_dump_pathlib, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_roundtrip_repr, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_string, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_string_with_ws, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_tofile_cleanup, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_tofile_format, test/torch_np/numpy_tests/core/test_multiarray.py::TestIO::test_unseekable_fromfile, test/torch_np/numpy_tests/core/test_multiarray.py::TestFromBuffer::test_basic_little_dtype1, test/torch_np/numpy_tests/core/test_multiarray.py::TestFromBuffer::test_mmap_close, test/torch_np/numpy_tests/core/test_multiarray.py::TestResize::test_basic, test/torch_np/numpy_tests/core/test_multiarray.py::TestResize::test_empty_view, test/torch_np/numpy_tests/core/test_multiarray.py::TestResize::test_freeform_shape, test/torch_np/numpy_tests/core/test_multiarray.py::TestResize::test_int_shape, test/torch_np/numpy_tests/core/test_multiarray.py::TestResize::test_none_shape, test/torch_np/numpy_tests/core/test_multiarray.py::TestResize::test_zeros_appended, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_ddof, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_dtype_from_dtype, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_dtype_from_input, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_keepdims, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_mean_float16, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_mean_where, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_out, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_python_type, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_var_axis_error, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_var_complex_byteorder, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_var_dimensions, test/torch_np/numpy_tests/core/test_multiarray.py::TestStats::test_var_values, test/torch_np/numpy_tests/core/test_multiarray.py::TestVdot::test_basic, test/torch_np/numpy_tests/core/test_multiarray.py::TestVdot::test_vdot_array_order, test/torch_np/numpy_tests/core/test_multiarray.py::TestVdot::test_vdot_uncontiguous, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_all, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dot_2args, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dot_3args, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dot_3args_errors, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dotcolumnvect2, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dotmatmat, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dotmatvec, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dotmatvec2, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dotvecscalar2, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dotvecvecinner, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_dotvecvecouter, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_huge_vectordot_dtype0, test/torch_np/numpy_tests/core/test_multiarray.py::TestDot::test_huge_vectordot_dtype1, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mm2, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mm3, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mmN3, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mmT1, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mmT2, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mmT4, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mv11, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mvN1, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mvN2, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mvN3, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mvN4, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mvN5, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mvN7, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_mvN8, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_s0_2, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_s0_4, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_vm1, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_vm3, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_dot_equivalent_vm4, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_empty_out, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_exceptions, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_matmul_bool, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_out_contiguous, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_result_types_2, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_scalar_output, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_shapes, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_vector_matrix_values, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmul::test_vector_vector_values, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_exceptions, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_matmul_axes, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_matmul_inplace, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_matmul_inplace_2, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_matmul_raises, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_result_types, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_scalar_output, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_shapes, test/torch_np/numpy_tests/core/test_multiarray.py::TestMatmulOperator::test_vector_vector_values, test/torch_np/numpy_tests/core/test_multiarray.py::TestInner::test_3d_tensor, test/torch_np/numpy_tests/core/test_multiarray.py::TestInner::test_inner_scalar_and_vector, test/torch_np/numpy_tests/core/test_multiarray.py::TestChoose::test_basic, test/torch_np/numpy_tests/core/test_multiarray.py::TestChoose::test_broadcast1, test/torch_np/numpy_tests/core/test_multiarray.py::TestChoose::test_docstring_2, test/torch_np/numpy_tests/core/test_multiarray.py::TestChoose::test_docstring_3, test/torch_np/numpy_tests/core/test_multiarray.py::TestChoose::test_output_dtype_ops0, test/torch_np/numpy_tests/core/test_multiarray.py::TestChoose::test_output_dtype_ops3, test/torch_np/numpy_tests/core/test_multiarray.py::TestRepeat::test_axis_spec, test/torch_np/numpy_tests/core/test_multiarray.py::TestRepeat::test_basic, test/torch_np/numpy_tests/core/test_multiarray.py::TestMinScalarType::test_usigned_shortshort, test/torch_np/numpy_tests/core/test_multiarray.py::TestPEP3118Dtype::test_byteorder_inside_struct, test/torch_np/numpy_tests/core/test_multiarray.py::TestPEP3118Dtype::test_char_vs_string, test/torch_np/numpy_tests/core/test_multiarray.py::TestPEP3118Dtype::test_intra_padding, test/torch_np/numpy_tests/core/test_multiarray.py::TestPEP3118Dtype::test_native_padding, test/torch_np/numpy_tests/core/test_multiarray.py::TestPEP3118Dtype::test_native_padding_2, test/torch_np/numpy_tests/core/test_multiarray.py::TestPEP3118Dtype::test_native_padding_3, test/torch_np/numpy_tests/core/test_multiarray.py::TestPEP3118Dtype::test_unnamed_fields, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_array_interfaces, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr0_order12_order2_A, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr0_order12_order2_F, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr0_order12_order2_K, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr0_order1_C_order2_K, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr0_order1_F_order2_C, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr1_order12_order2_A, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr1_order1_C_order2_C, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr1_order1_F_order2_C, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr1_order1_F_order2_F, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_order_mismatch_arr1_order1_F_order2_K, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_scalars, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayCreationCopyArgument::test_striding_not_ok, test/torch_np/numpy_tests/core/test_multiarray.py::TestArrayAttributeDeletion::test_multiarray_writable_attributes_deletion, test/torch_np/numpy_tests/core/test_multiarray.py::TestDelMisc::test_flat_element_deletion, test/torch_np/numpy_tests/core/test_multiarray.py::TestConversion::test_to_int_scalar, test/torch_np/numpy_tests/core/test_multiarray.py::TestWhere::test_dtype_mix, test/torch_np/numpy_tests/core/test_multiarray.py::TestWhere::test_empty_result, test/torch_np/numpy_tests/core/test_multiarray.py::TestWhere::test_error, test/torch_np/numpy_tests/core/test_multiarray.py::TestWhere::test_ndim, test/torch_np/numpy_tests/core/test_multiarray.py::TestHashing::test_arrays_not_hashable, test/torch_np/numpy_tests/core/test_multiarray.py::TestHashing::test_collections_hashable, test/torch_np/numpy_tests/core/test_multiarray.py::TestFormat::test_0d, test/torch_np/numpy_tests/core/test_multiarray.py::TestFormat::test_1d_format, test/torch_np/numpy_tests/core/test_multiarray.py::TestWritebackIfCopy::test_dot_out, test/torch_np/numpy_tests/core/test_multiarray.py::TestWritebackIfCopy::test_put_noncontiguous, test/torch_np/numpy_tests/core/test_multiarray.py::TestArange::test_explicit_dtype_dt1, test/torch_np/numpy_tests/core/test_multiarray.py::TestArange::test_explicit_dtype_dt2, test/torch_np/numpy_tests/core/test_multiarray.py::TestArange::test_infinite, test/torch_np/numpy_tests/core/test_multiarray.py::TestArange::test_nan_step, test/torch_np/numpy_tests/core/test_multiarray.py::TestArange::test_zero_step, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_1023, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_151, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_16, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_2047, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_24, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_32, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_383, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_48, test/torch_np/numpy_tests/core/test_multiarray.py::TestSortFloatMisc::test_sort_float_N_64 2025-08-14T22:58:29.1071068Z 2025-08-14T22:58:30.1113259Z Running test batch 'tests to run' cost 5277.55 seconds 2025-08-14T22:58:31.0394452Z 2025-08-14T22:58:31.0394855Z real 88m3.818s 2025-08-14T22:58:31.0395169Z user 236m51.870s 2025-08-14T22:58:31.0395486Z sys 29m32.637s 2025-08-14T22:58:31.0395763Z + assert_git_not_dirty 2025-08-14T22:58:31.0396075Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *rocm* ]] 2025-08-14T22:58:31.0396481Z + [[ linux-jammy-cuda12.8-py3.10-gcc11-sm86 != *xla* ]] 2025-08-14T22:58:31.0402039Z ++ git status --porcelain 2025-08-14T22:58:31.0402411Z ++ grep -v '?? third_party' 2025-08-14T22:58:40.4782141Z ++ true 2025-08-14T22:58:40.4782945Z + git_status= 2025-08-14T22:58:40.4783194Z + [[ -n '' ]] 2025-08-14T22:58:40.4784811Z + sccache_epilogue 2025-08-14T22:58:40.4785522Z + echo '::group::Sccache Compilation Log' 2025-08-14T22:58:40.4786144Z ##[group]Sccache Compilation Log 2025-08-14T22:58:40.4786487Z + echo '=================== sccache compilation log ===================' 2025-08-14T22:58:40.4786865Z =================== sccache compilation log =================== 2025-08-14T22:58:40.4787430Z + python /var/lib/jenkins/workspace/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log 2025-08-14T22:58:40.4934448Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ===========' 2025-08-14T22:58:40.4935192Z =========== If your build fails, please take a look at the log above for possible reasons =========== 2025-08-14T22:58:40.4935641Z + sccache --show-stats 2025-08-14T22:58:40.4974605Z Compile requests 504 2025-08-14T22:58:40.4974938Z Compile requests executed 57 2025-08-14T22:58:40.4975291Z Cache hits 12 2025-08-14T22:58:40.4975600Z Cache hits (C/C++) 12 2025-08-14T22:58:40.4975889Z Cache misses 45 2025-08-14T22:58:40.4976177Z Cache misses (C/C++) 45 2025-08-14T22:58:40.4976474Z Cache hits rate 21.05 % 2025-08-14T22:58:40.4976774Z Cache hits rate (C/C++) 21.05 % 2025-08-14T22:58:40.4977073Z Cache timeouts 0 2025-08-14T22:58:40.4977369Z Cache read errors 0 2025-08-14T22:58:40.4977662Z Forced recaches 0 2025-08-14T22:58:40.4977966Z Cache write errors 0 2025-08-14T22:58:40.4978255Z Cache errors 0 2025-08-14T22:58:40.4978544Z Compilations 45 2025-08-14T22:58:40.4978843Z Compilation failures 0 2025-08-14T22:58:40.4979158Z Non-cacheable compilations 0 2025-08-14T22:58:40.4979462Z Non-cacheable calls 13 2025-08-14T22:58:40.4979770Z Non-compilation calls 434 2025-08-14T22:58:40.4980080Z Unsupported compiler calls 0 2025-08-14T22:58:40.4980391Z Average cache write 0.062 s 2025-08-14T22:58:40.4980690Z Average compiler 13.509 s 2025-08-14T22:58:40.4980997Z Average cache read hit 0.065 s 2025-08-14T22:58:40.4981310Z Failed distributed compilations 0 2025-08-14T22:58:40.4981514Z 2025-08-14T22:58:40.4981611Z Non-cacheable reasons: 2025-08-14T22:58:40.4981871Z -E 11 2025-08-14T22:58:40.4982178Z unknown source language 2 2025-08-14T22:58:40.4982378Z 2025-08-14T22:58:40.4982602Z Cache location s3, name: ossci-compiler-cache-circleci-v2, prefix: / 2025-08-14T22:58:40.4983027Z Version (client) 0.10.0 2025-08-14T22:58:40.4983336Z + sccache --stop-server 2025-08-14T22:58:40.5007207Z Stopping sccache server... 2025-08-14T22:58:40.5011787Z Compile requests 504 2025-08-14T22:58:40.5012143Z Compile requests executed 57 2025-08-14T22:58:40.5012442Z Cache hits 12 2025-08-14T22:58:40.5012721Z Cache hits (C/C++) 12 2025-08-14T22:58:40.5013022Z Cache misses 45 2025-08-14T22:58:40.5013316Z Cache misses (C/C++) 45 2025-08-14T22:58:40.5013650Z Cache hits rate 21.05 % 2025-08-14T22:58:40.5013953Z Cache hits rate (C/C++) 21.05 % 2025-08-14T22:58:40.5014266Z Cache timeouts 0 2025-08-14T22:58:40.5014566Z Cache read errors 0 2025-08-14T22:58:40.5014856Z Forced recaches 0 2025-08-14T22:58:40.5015168Z Cache write errors 0 2025-08-14T22:58:40.5015462Z Cache errors 0 2025-08-14T22:58:40.5015749Z Compilations 45 2025-08-14T22:58:40.5016053Z Compilation failures 0 2025-08-14T22:58:40.5016373Z Non-cacheable compilations 0 2025-08-14T22:58:40.5016678Z Non-cacheable calls 13 2025-08-14T22:58:40.5016980Z Non-compilation calls 434 2025-08-14T22:58:40.5017290Z Unsupported compiler calls 0 2025-08-14T22:58:40.5017604Z Average cache write 0.062 s 2025-08-14T22:58:40.5017903Z Average compiler 13.509 s 2025-08-14T22:58:40.5018213Z Average cache read hit 0.065 s 2025-08-14T22:58:40.5018535Z Failed distributed compilations 0 2025-08-14T22:58:40.5018820Z 2025-08-14T22:58:40.5019095Z Non-cacheable reasons: 2025-08-14T22:58:40.5019499Z -E 11 2025-08-14T22:58:40.5019867Z unknown source language 2 2025-08-14T22:58:40.5020064Z 2025-08-14T22:58:40.5020290Z Cache location s3, name: ossci-compiler-cache-circleci-v2, prefix: / 2025-08-14T22:58:40.5020712Z Version (client) 0.10.0 2025-08-14T22:58:40.5021019Z + echo ::endgroup:: 2025-08-14T22:58:40.5021551Z ##[endgroup] 2025-08-14T22:58:40.5021762Z + cleanup_workspace 2025-08-14T22:58:40.5022225Z + echo 'sudo may print the following warning message that can be ignored. The chown command will still run.' 2025-08-14T22:58:40.5022924Z sudo may print the following warning message that can be ignored. The chown command will still run. 2025-08-14T22:58:40.5023504Z + echo ' sudo: setrlimit(RLIMIT_STACK): Operation not permitted' 2025-08-14T22:58:40.5023979Z sudo: setrlimit(RLIMIT_STACK): Operation not permitted 2025-08-14T22:58:40.5024474Z + echo 'For more details refer to https://github.com/sudo-project/sudo/issues/42' 2025-08-14T22:58:40.5025028Z For more details refer to https://github.com/sudo-project/sudo/issues/42 2025-08-14T22:58:40.5025470Z + sudo chown -R 1000 /var/lib/jenkins/workspace 2025-08-14T22:58:41.5732316Z ##[group]Run pytorch/test-infra/.github/actions/upload-benchmark-results@main 2025-08-14T22:58:41.5732780Z with: 2025-08-14T22:58:41.5733023Z benchmark-results-dir: test/test-reports 2025-08-14T22:58:41.5733335Z dry-run: false 2025-08-14T22:58:41.5733567Z schema-version: v3 2025-08-14T22:58:41.5733996Z github-token: *** 2025-08-14T22:58:41.5734216Z env: 2025-08-14T22:58:41.5734427Z GIT_DEFAULT_BRANCH: main 2025-08-14T22:58:41.5734749Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T22:58:41.5735271Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T22:58:41.5735738Z ##[endgroup] 2025-08-14T22:58:41.5762586Z ##[group]Run set -eux 2025-08-14T22:58:41.5762847Z set -eux 2025-08-14T22:58:41.5763209Z python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-08-14T22:58:41.5763652Z  2025-08-14T22:58:41.5763864Z DEVICE_NAME="" 2025-08-14T22:58:41.5764141Z DEVICE_TYPE="" 2025-08-14T22:58:41.5764475Z  2025-08-14T22:58:41.5764704Z if command -v nvidia-smi; then 2025-08-14T22:58:41.5765235Z  # NB: I'm using PyTorch here to get the device name, however, it needs to 2025-08-14T22:58:41.5765772Z  # install the correct version of PyTorch manually for now. Any PyTorch 2025-08-14T22:58:41.5766266Z  # version is fine, I just use 2.7.1 to satify PYPIDEP linter 2025-08-14T22:58:41.5766670Z  python3 -mpip install torch==2.7.1 2025-08-14T22:58:41.5767006Z elif command -v rocminfo; then 2025-08-14T22:58:41.5767415Z  # NB: Installing torch on ROCm runner with pip here causes CI to fail 2025-08-14T22:58:41.5767933Z  # with a memoryview is too large error only on MI300 runners. Is pip 2025-08-14T22:58:41.5768453Z  # version on ROCm runner there too old? As a workaround, let's use the 2025-08-14T22:58:41.5768919Z  # GPU device name coming from rocminfo instead 2025-08-14T22:58:41.5769255Z  DEVICE_NAME=rocm 2025-08-14T22:58:41.5769707Z  DEVICE_TYPE=$(rocminfo | grep "Marketing Name" | tail -n1 | awk -F':' '{print $2}' | xargs) 2025-08-14T22:58:41.5770171Z fi 2025-08-14T22:58:41.5770379Z  2025-08-14T22:58:41.5770637Z echo "DEVICE_NAME=$DEVICE_NAME" >> $GITHUB_ENV 2025-08-14T22:58:41.5771026Z echo "DEVICE_TYPE=$DEVICE_TYPE" >> $GITHUB_ENV 2025-08-14T22:58:41.5785954Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T22:58:41.5786299Z env: 2025-08-14T22:58:41.5786513Z GIT_DEFAULT_BRANCH: main 2025-08-14T22:58:41.5786835Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T22:58:41.5787350Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T22:58:41.5787911Z ##[endgroup] 2025-08-14T22:58:41.5837434Z + python3 -mpip install boto3==1.35.33 psutil==7.0.0 pynvml==12.0.0 2025-08-14T22:58:41.8981847Z Defaulting to user installation because normal site-packages is not writeable 2025-08-14T22:58:43.1080254Z Collecting boto3==1.35.33 2025-08-14T22:58:43.1265175Z Downloading boto3-1.35.33-py3-none-any.whl (139 kB) 2025-08-14T22:58:43.4683923Z Collecting psutil==7.0.0 2025-08-14T22:58:43.4717499Z Downloading psutil-7.0.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (277 kB) 2025-08-14T22:58:43.4994527Z Collecting pynvml==12.0.0 2025-08-14T22:58:43.5030244Z Downloading pynvml-12.0.0-py3-none-any.whl (26 kB) 2025-08-14T22:58:44.7815643Z Collecting botocore<1.36.0,>=1.35.33 2025-08-14T22:58:44.7849518Z Downloading botocore-1.35.99-py3-none-any.whl (13.3 MB) 2025-08-14T22:58:44.9245601Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /usr/lib/python3.9/site-packages (from boto3==1.35.33) (0.10.0) 2025-08-14T22:58:44.9673700Z Collecting s3transfer<0.11.0,>=0.10.0 2025-08-14T22:58:44.9704399Z Downloading s3transfer-0.10.4-py3-none-any.whl (83 kB) 2025-08-14T22:58:45.0212759Z Collecting nvidia-ml-py<13.0.0a0,>=12.0.0 2025-08-14T22:58:45.0243720Z Downloading nvidia_ml_py-12.575.51-py3-none-any.whl (47 kB) 2025-08-14T22:58:45.0346368Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/lib/python3.9/site-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.25.10) 2025-08-14T22:58:45.0352164Z Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /usr/lib/python3.9/site-packages (from botocore<1.36.0,>=1.35.33->boto3==1.35.33) (2.8.1) 2025-08-14T22:58:45.1974442Z Requirement already satisfied: six>=1.5 in /usr/lib/python3.9/site-packages (from python-dateutil<3.0.0,>=2.1->botocore<1.36.0,>=1.35.33->boto3==1.35.33) (1.15.0) 2025-08-14T22:58:45.3348603Z Installing collected packages: botocore, s3transfer, nvidia-ml-py, pynvml, psutil, boto3 2025-08-14T22:58:45.9021691Z Attempting uninstall: nvidia-ml-py 2025-08-14T22:58:45.9022738Z Found existing installation: nvidia-ml-py 11.525.84 2025-08-14T22:58:45.9037919Z Uninstalling nvidia-ml-py-11.525.84: 2025-08-14T22:58:45.9288297Z Successfully uninstalled nvidia-ml-py-11.525.84 2025-08-14T22:58:45.9904199Z Attempting uninstall: psutil 2025-08-14T22:58:45.9904986Z Found existing installation: psutil 5.9.8 2025-08-14T22:58:45.9993477Z Uninstalling psutil-5.9.8: 2025-08-14T22:58:46.0000643Z Successfully uninstalled psutil-5.9.8 2025-08-14T22:58:46.1719996Z Successfully installed boto3-1.35.33 botocore-1.35.99 nvidia-ml-py-12.575.51 psutil-7.0.0 pynvml-12.0.0 s3transfer-0.10.4 2025-08-14T22:58:46.2954088Z + DEVICE_NAME= 2025-08-14T22:58:46.2954335Z + DEVICE_TYPE= 2025-08-14T22:58:46.2954575Z + command -v nvidia-smi 2025-08-14T22:58:46.2954846Z + python3 -mpip install torch==2.7.1 2025-08-14T22:58:46.2955131Z /usr/bin/nvidia-smi 2025-08-14T22:58:46.5326714Z Defaulting to user installation because normal site-packages is not writeable 2025-08-14T22:58:46.8149190Z Collecting torch==2.7.1 2025-08-14T22:58:46.8309555Z Downloading torch-2.7.1-cp39-cp39-manylinux_2_28_x86_64.whl (821.1 MB) 2025-08-14T22:59:00.4209697Z Collecting nvidia-cuda-runtime-cu12==12.6.77 2025-08-14T22:59:00.4245254Z Downloading nvidia_cuda_runtime_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (897 kB) 2025-08-14T22:59:00.4825052Z Collecting nvidia-cudnn-cu12==9.5.1.17 2025-08-14T22:59:00.4857249Z Downloading nvidia_cudnn_cu12-9.5.1.17-py3-none-manylinux_2_28_x86_64.whl (571.0 MB) 2025-08-14T22:59:08.8980236Z Collecting fsspec 2025-08-14T22:59:08.9016309Z Downloading fsspec-2025.7.0-py3-none-any.whl (199 kB) 2025-08-14T22:59:08.9398579Z Collecting nvidia-cublas-cu12==12.6.4.1 2025-08-14T22:59:08.9432662Z Downloading nvidia_cublas_cu12-12.6.4.1-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (393.1 MB) 2025-08-14T22:59:14.5894518Z Collecting filelock 2025-08-14T22:59:14.5963466Z Downloading filelock-3.19.1-py3-none-any.whl (15 kB) 2025-08-14T22:59:14.6362428Z Collecting triton==3.3.1 2025-08-14T22:59:14.6395493Z Downloading triton-3.3.1-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (155.6 MB) 2025-08-14T22:59:16.2406138Z Collecting nvidia-cufile-cu12==1.11.1.6 2025-08-14T22:59:16.2439821Z Downloading nvidia_cufile_cu12-1.11.1.6-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.1 MB) 2025-08-14T22:59:16.2760463Z Collecting nvidia-cusparselt-cu12==0.6.3 2025-08-14T22:59:16.2794802Z Downloading nvidia_cusparselt_cu12-0.6.3-py3-none-manylinux2014_x86_64.whl (156.8 MB) 2025-08-14T22:59:17.7971468Z Collecting nvidia-cusparse-cu12==12.5.4.2 2025-08-14T22:59:17.8006803Z Downloading nvidia_cusparse_cu12-12.5.4.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (216.6 MB) 2025-08-14T22:59:20.3962325Z Collecting sympy>=1.13.3 2025-08-14T22:59:20.3994162Z Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB) 2025-08-14T22:59:20.4998548Z Collecting nvidia-cusolver-cu12==11.7.1.2 2025-08-14T22:59:20.5032249Z Downloading nvidia_cusolver_cu12-11.7.1.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (158.2 MB) 2025-08-14T22:59:22.1049550Z Collecting nvidia-nvtx-cu12==12.6.77 2025-08-14T22:59:22.1079329Z Downloading nvidia_nvtx_cu12-12.6.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB) 2025-08-14T22:59:22.1115328Z Requirement already satisfied: jinja2 in /usr/lib/python3.9/site-packages (from torch==2.7.1) (2.11.3) 2025-08-14T22:59:22.1418523Z Collecting nvidia-cuda-nvrtc-cu12==12.6.77 2025-08-14T22:59:22.1450595Z Downloading nvidia_cuda_nvrtc_cu12-12.6.77-py3-none-manylinux2014_x86_64.whl (23.7 MB) 2025-08-14T22:59:22.3983195Z Collecting nvidia-cuda-cupti-cu12==12.6.80 2025-08-14T22:59:22.4048828Z Downloading nvidia_cuda_cupti_cu12-12.6.80-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (8.9 MB) 2025-08-14T22:59:22.5315178Z Collecting nvidia-nccl-cu12==2.26.2 2025-08-14T22:59:22.5349024Z Downloading nvidia_nccl_cu12-2.26.2-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (201.3 MB) 2025-08-14T22:59:24.7609285Z Collecting nvidia-cufft-cu12==11.3.0.4 2025-08-14T22:59:24.7652653Z Downloading nvidia_cufft_cu12-11.3.0.4-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (200.2 MB) 2025-08-14T22:59:27.0282597Z Collecting nvidia-curand-cu12==10.3.7.77 2025-08-14T22:59:27.0317791Z Downloading nvidia_curand_cu12-10.3.7.77-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (56.3 MB) 2025-08-14T22:59:27.6463831Z Collecting networkx 2025-08-14T22:59:27.6494775Z Downloading networkx-3.2.1-py3-none-any.whl (1.6 MB) 2025-08-14T22:59:27.7111533Z Collecting nvidia-nvjitlink-cu12==12.6.85 2025-08-14T22:59:27.7145822Z Downloading nvidia_nvjitlink_cu12-12.6.85-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (19.7 MB) 2025-08-14T22:59:27.9110851Z Requirement already satisfied: typing-extensions>=4.10.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from torch==2.7.1) (4.14.1) 2025-08-14T22:59:27.9387012Z Requirement already satisfied: setuptools>=40.8.0 in /usr/lib/python3.9/site-packages (from triton==3.3.1->torch==2.7.1) (59.6.0) 2025-08-14T22:59:27.9671567Z Collecting mpmath<1.4,>=1.1.0 2025-08-14T22:59:27.9704609Z Downloading mpmath-1.3.0-py3-none-any.whl (536 kB) 2025-08-14T22:59:28.0641080Z Requirement already satisfied: MarkupSafe>=0.23 in /usr/lib64/python3.9/site-packages (from jinja2->torch==2.7.1) (1.1.1) 2025-08-14T22:59:28.4223528Z Installing collected packages: nvidia-nvjitlink-cu12, nvidia-cusparse-cu12, nvidia-cublas-cu12, mpmath, triton, sympy, nvidia-nvtx-cu12, nvidia-nccl-cu12, nvidia-cusparselt-cu12, nvidia-cusolver-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, networkx, fsspec, filelock, torch 2025-08-14T22:59:37.8601981Z WARNING: The scripts proton and proton-viewer are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2025-08-14T22:59:37.8604674Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-08-14T22:59:41.9554211Z WARNING: The script isympy is installed in '/home/ec2-user/.local/bin' which is not on PATH. 2025-08-14T22:59:41.9554960Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-08-14T23:00:13.1742743Z WARNING: The scripts torchfrtrace and torchrun are installed in '/home/ec2-user/.local/bin' which is not on PATH. 2025-08-14T23:00:13.1743689Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-08-14T23:00:13.6541575Z Successfully installed filelock-3.19.1 fsspec-2025.7.0 mpmath-1.3.0 networkx-3.2.1 nvidia-cublas-cu12-12.6.4.1 nvidia-cuda-cupti-cu12-12.6.80 nvidia-cuda-nvrtc-cu12-12.6.77 nvidia-cuda-runtime-cu12-12.6.77 nvidia-cudnn-cu12-9.5.1.17 nvidia-cufft-cu12-11.3.0.4 nvidia-cufile-cu12-1.11.1.6 nvidia-curand-cu12-10.3.7.77 nvidia-cusolver-cu12-11.7.1.2 nvidia-cusparse-cu12-12.5.4.2 nvidia-cusparselt-cu12-0.6.3 nvidia-nccl-cu12-2.26.2 nvidia-nvjitlink-cu12-12.6.85 nvidia-nvtx-cu12-12.6.77 sympy-1.14.0 torch-2.7.1 triton-3.3.1 2025-08-14T23:00:14.3412935Z + echo DEVICE_NAME= 2025-08-14T23:00:14.3413335Z + echo DEVICE_TYPE= 2025-08-14T23:00:14.3438622Z ##[group]Run set -eux 2025-08-14T23:00:14.3438890Z set -eux 2025-08-14T23:00:14.3439109Z  2025-08-14T23:00:14.3439339Z if [[ -z "${GITHUB_TOKEN}" ]]; then 2025-08-14T23:00:14.3439681Z  echo "Missing github-token input" 2025-08-14T23:00:14.3439986Z  exit 1 2025-08-14T23:00:14.3440197Z fi 2025-08-14T23:00:14.3452979Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:14.3453343Z env: 2025-08-14T23:00:14.3453561Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:14.3453890Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:14.3454417Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:14.3454941Z DEVICE_NAME: 2025-08-14T23:00:14.3455163Z DEVICE_TYPE: 2025-08-14T23:00:14.3455590Z GITHUB_TOKEN: *** 2025-08-14T23:00:14.3455833Z ##[endgroup] 2025-08-14T23:00:14.3491155Z + [[ -z *** ]] 2025-08-14T23:00:14.3629544Z ##[group]Run pytorch/test-infra/.github/actions/get-workflow-job-id@main 2025-08-14T23:00:14.3630084Z with: 2025-08-14T23:00:14.3630568Z github-token: *** 2025-08-14T23:00:14.3630793Z env: 2025-08-14T23:00:14.3631007Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:14.3631332Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:14.3631856Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:14.3632331Z DEVICE_NAME: 2025-08-14T23:00:14.3632557Z DEVICE_TYPE: 2025-08-14T23:00:14.3632780Z ##[endgroup] 2025-08-14T23:00:14.3748395Z ##[group]Run set -eux 2025-08-14T23:00:14.3748698Z set -eux 2025-08-14T23:00:14.3748926Z  2025-08-14T23:00:14.3749375Z python3 "${GITHUB_ACTION_PATH}/../../scripts/get_workflow_job_id.py" "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-08-14T23:00:14.3759108Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:14.3759470Z env: 2025-08-14T23:00:14.3759694Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:14.3760019Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:14.3760544Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:14.3761010Z DEVICE_NAME: 2025-08-14T23:00:14.3761238Z DEVICE_TYPE: 2025-08-14T23:00:14.3761593Z GITHUB_TOKEN: *** 2025-08-14T23:00:14.3761828Z ##[endgroup] 2025-08-14T23:00:14.3792659Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/get-workflow-job-id/../../scripts/get_workflow_job_id.py 16976255024 i-006ff0d0e48756a9a 2025-08-14T23:00:15.3015598Z setting job-id=48128162047 2025-08-14T23:00:15.3016342Z setting job-name=linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T23:00:15.3232031Z ##[group]Run set -eux 2025-08-14T23:00:15.3232301Z set -eux 2025-08-14T23:00:15.3232538Z  2025-08-14T23:00:15.3232909Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_metadata.py" \ 2025-08-14T23:00:15.3233397Z  --schema-version "${SCHEMA_VERSION}" \ 2025-08-14T23:00:15.3233731Z  --repo "${REPO}" \ 2025-08-14T23:00:15.3234029Z  --head-branch "${HEAD_BRANCH}" \ 2025-08-14T23:00:15.3234343Z  --head-sha "${HEAD_SHA}" \ 2025-08-14T23:00:15.3234666Z  --workflow-id "${WORKFLOW_RUN_ID}" \ 2025-08-14T23:00:15.3235009Z  --run-attempt "${RUN_ATTEMPT}" \ 2025-08-14T23:00:15.3235324Z  --job-id "${JOB_ID}" \ 2025-08-14T23:00:15.3235613Z  --job-name "${JOB_NAME}" 2025-08-14T23:00:15.3245326Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:15.3245690Z env: 2025-08-14T23:00:15.3245903Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:15.3246225Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:15.3246954Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:15.3247839Z DEVICE_NAME: 2025-08-14T23:00:15.3248129Z DEVICE_TYPE: 2025-08-14T23:00:15.3248414Z SCHEMA_VERSION: v3 2025-08-14T23:00:15.3248713Z REPO: pytorch/pytorch 2025-08-14T23:00:15.3249023Z HEAD_BRANCH: refs/heads/main 2025-08-14T23:00:15.3249332Z HEAD_SHA: 1fc683cf17c8c673044538d10266c00f92987be2 2025-08-14T23:00:15.3249662Z WORKFLOW_RUN_ID: 16976255024 2025-08-14T23:00:15.3249918Z RUN_ATTEMPT: 1 2025-08-14T23:00:15.3250143Z JOB_ID: 48128162047 2025-08-14T23:00:15.3250593Z JOB_NAME: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T23:00:15.3251078Z ##[endgroup] 2025-08-14T23:00:15.3289564Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_metadata.py --schema-version v3 --repo pytorch/pytorch --head-branch refs/heads/main --head-sha 1fc683cf17c8c673044538d10266c00f92987be2 --workflow-id 16976255024 --run-attempt 1 --job-id 48128162047 --job-name 'linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu)' 2025-08-14T23:00:15.3773205Z ##[group]Run set -eux 2025-08-14T23:00:15.3773475Z set -eux 2025-08-14T23:00:15.3773697Z  2025-08-14T23:00:15.3774063Z python3 "${GITHUB_ACTION_PATH}/../../scripts/benchmarks/gather_runners_info.py" 2025-08-14T23:00:15.3783244Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:15.3783595Z env: 2025-08-14T23:00:15.3783806Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:15.3784128Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:15.3784645Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:15.3785155Z DEVICE_NAME: 2025-08-14T23:00:15.3785381Z DEVICE_TYPE: 2025-08-14T23:00:15.3785596Z ##[endgroup] 2025-08-14T23:00:15.3816441Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/benchmarks/gather_runners_info.py 2025-08-14T23:00:16.3302296Z /home/ec2-user/.local/lib/python3.9/site-packages/torch/_subclasses/functional_tensor.py:276: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:81.) 2025-08-14T23:00:16.3303442Z cpu = _conversion_method_template(device=torch.device("cpu")) 2025-08-14T23:00:17.3638303Z ##[group]Run set -eux 2025-08-14T23:00:17.3638557Z set -eux 2025-08-14T23:00:17.3638767Z  2025-08-14T23:00:17.3639001Z # TODO (huydhn): Implement this part 2025-08-14T23:00:17.3639359Z echo "dependencies={}" >> "${GITHUB_OUTPUT}" 2025-08-14T23:00:17.3648869Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:17.3649220Z env: 2025-08-14T23:00:17.3649430Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:17.3649754Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:17.3650271Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:17.3650735Z DEVICE_NAME: 2025-08-14T23:00:17.3650958Z DEVICE_TYPE: 2025-08-14T23:00:17.3651177Z ##[endgroup] 2025-08-14T23:00:17.3682098Z + echo 'dependencies={}' 2025-08-14T23:00:17.3817239Z ##[group]Run set -eux 2025-08-14T23:00:17.3817522Z set -eux 2025-08-14T23:00:17.3817751Z  2025-08-14T23:00:17.3818015Z if [[ ! -d "${BENCHMARK_RESULTS_DIR}" ]]; then 2025-08-14T23:00:17.3818434Z  echo "${BENCHMARK_RESULTS_DIR} does not exist, skipping" 2025-08-14T23:00:17.3818889Z  # We don't want the job to fail if the directory doesn't exist 2025-08-14T23:00:17.3819280Z  exit 0 2025-08-14T23:00:17.3819508Z fi 2025-08-14T23:00:17.3819715Z  2025-08-14T23:00:17.3819954Z if [[ "${DRY_RUN}" == "true" ]]; then 2025-08-14T23:00:17.3820595Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-08-14T23:00:17.3821124Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-08-14T23:00:17.3821556Z  --metadata "${BENCHMARK_METADATA}" \ 2025-08-14T23:00:17.3821902Z  --runners "${RUNNER_INFO}" \ 2025-08-14T23:00:17.3822236Z  --dependencies "${DEPENDENCIES}" \ 2025-08-14T23:00:17.3822547Z  --dry-run 2025-08-14T23:00:17.3822791Z else 2025-08-14T23:00:17.3823159Z  python3 "${GITHUB_ACTION_PATH}/../../scripts/upload_benchmark_results.py" \ 2025-08-14T23:00:17.3823661Z  --benchmark-results-dir "${BENCHMARK_RESULTS_DIR}" \ 2025-08-14T23:00:17.3824062Z  --metadata "${BENCHMARK_METADATA}" \ 2025-08-14T23:00:17.3824402Z  --runners "${RUNNER_INFO}" \ 2025-08-14T23:00:17.3824737Z  --dependencies "${DEPENDENCIES}" 2025-08-14T23:00:17.3825199Z fi 2025-08-14T23:00:17.3834392Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:17.3834742Z env: 2025-08-14T23:00:17.3834951Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:17.3835417Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:17.3835930Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:17.3836380Z DEVICE_NAME: 2025-08-14T23:00:17.3836600Z DEVICE_TYPE: 2025-08-14T23:00:17.3836848Z BENCHMARK_RESULTS_DIR: test/test-reports 2025-08-14T23:00:17.3837147Z DRY_RUN: false 2025-08-14T23:00:17.3838327Z BENCHMARK_METADATA: {"timestamp": 1755212415, "schema_version": "v3", "name": "linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "1fc683cf17c8c673044538d10266c00f92987be2", "workflow_id": 16976255024, "run_attempt": 1, "job_id": 48128162047} 2025-08-14T23:00:17.3840003Z RUNNER_INFO: [{"cpu_info": "x86_64", "cpu_count": 16, "avail_mem_in_gb": 62, "extra_info": {"hostname": "ip-10-0-74-8.ec2.internal"}, "name": "cuda", "type": "NVIDIA A10G", "gpu_count": 1, "avail_gpu_mem_in_gb": 22}] 2025-08-14T23:00:17.3840735Z DEPENDENCIES: {} 2025-08-14T23:00:17.3840973Z ##[endgroup] 2025-08-14T23:00:17.3872331Z + [[ ! -d test/test-reports ]] 2025-08-14T23:00:17.3872651Z + [[ false == \t\r\u\e ]] 2025-08-14T23:00:17.3875237Z + python3 /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py --benchmark-results-dir test/test-reports --metadata '{"timestamp": 1755212415, "schema_version": "v3", "name": "linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu)", "repo": "pytorch/pytorch", "head_branch": "refs/heads/main", "head_sha": "1fc683cf17c8c673044538d10266c00f92987be2", "workflow_id": 16976255024, "run_attempt": 1, "job_id": 48128162047}' --runners '[{"cpu_info": "x86_64", "cpu_count": 16, "avail_mem_in_gb": 62, "extra_info": {"hostname": "ip-10-0-74-8.ec2.internal"}, "name": "cuda", "type": "NVIDIA A10G", "gpu_count": 1, "avail_gpu_mem_in_gb": 22}]' --dependencies '{}' 2025-08-14T23:00:17.5675024Z /home/ec2-user/actions-runner/_work/_actions/pytorch/test-infra/main/.github/actions/upload-benchmark-results/../../scripts/upload_benchmark_results.py:236: UserWarning: {'included': [{'test_file': 'inductor/test_compiled_optimizers'}, {'test_file': 'export/test_torchbind'}, {'test_file': 'test_testing'}, {'test_file': 'test_ci_sanity_check_fail'}, {'test_file': 'inductor/test_aot_inductor'}, {'test_file': 'test_ops'}, {'test_file': 'test_jit_fuser_te'}, {'test_file': 'test_public_bindings'}, {'test_file': 'test_linalg'}, {'test_file': 'test_cpp_extensions_jit'}, {'test_file': 'test_functional_autograd_benchmark'}, {'test_file': 'inductor/test_torchinductor'}, {'test_file': 'doctests'}, {'test_file': 'inductor/test_max_autotune'}, {'test_file': 'inductor/test_torchinductor_dynamic_shapes'}, {'test_file': 'inductor/test_torchinductor_codegen_dynamic_shapes'}, {'test_file': 'inductor/test_torchinductor_opinfo'}, {'test_file': 'inductor/test_cpu_repro'}, {'test_file': 'inductor/test_cuda_repro'}, {'test_file': 'inductor/test_kernel_benchmark'}, {'test_file': 'inductor/test_cudagraph_trees'}, {'test_file': 'dynamo/test_dynamic_shapes'}, {'test_file': 'dynamo/test_unspec'}, {'test_file': 'dynamo/test_repros'}, {'test_file': 'inductor/test_pattern_matcher'}, {'test_file': 'inductor/test_mkldnn_pattern_matcher'}, {'test_file': 'inductor/test_extension_backend'}, {'test_file': 'inductor/test_perf'}, {'test_file': 'inductor/test_foreach'}, {'test_file': 'dynamo/test_modules'}, {'test_file': 'inductor/test_triton_wrapper'}, {'test_file': 'inductor/test_coordinate_descent_tuner'}, {'test_file': 'inductor/test_smoke'}, {'test_file': 'inductor/test_select_algorithm'}, {'test_file': 'inductor/test_dependencies'}, {'test_file': 'inductor/test_compiled_autograd'}, {'test_file': 'inductor/test_fused_attention'}, {'test_file': 'inductor/test_group_batch_fusion'}, {'test_file': 'inductor/test_inductor_freezing'}, {'test_file': 'inductor/test_split_cat_fx_passes'}, {'test_file': 'inductor/test_minifier'}, {'test_file': 'inductor/test_snode_runtime'}, {'test_file': 'inductor/test_custom_lowering'}, {'test_file': 'inductor/test_config'}, {'test_file': 'dynamo/test_backends'}, {'test_file': 'dynamo/test_after_aot'}, {'test_file': 'dynamo/test_higher_order_ops'}, {'test_file': 'inductor/test_layout_optim'}, {'test_file': 'inductor/test_binary_folding'}, {'test_file': 'dynamo/test_logging'}, {'test_file': 'dynamo/test_activation_checkpointing'}, {'test_file': 'inductor/test_mmdecomp'}, {'test_file': 'dynamo/test_cudagraphs'}, {'test_file': 'dynamo/test_misc'}, {'test_file': 'dynamo/test_aot_autograd'}, {'test_file': 'dynamo/test_ctx_manager'}, {'test_file': 'dynamo/test_exc'}, {'test_file': 'inductor/test_flex_attention'}, {'test_file': 'inductor/test_control_flow'}, {'test_file': 'inductor/test_aot_inductor_arrayref'}, {'test_file': 'inductor/test_padding'}, {'test_file': 'inductor/test_torchinductor_strided_blocks'}, {'test_file': 'inductor/test_triton_kernels'}, {'test_file': 'inductor/test_halide'}, {'test_file': 'inductor/test_triton_cpu_backend'}, {'test_file': 'inductor/test_aot_inductor_custom_ops'}, {'test_file': 'inductor/test_cpu_select_algorithm'}, {'test_file': 'inductor/test_flex_decoding'}, {'test_file': 'inductor/test_unbacked_symints'}, {'test_file': 'inductor/test_external_callables'}, {'test_file': 'inductor/test_provenance_tracing'}, {'test_file': 'inductor/test_benchmark_fusion'}, {'test_file': 'inductor/test_alignment'}, {'test_file': 'inductor/test_triton_syntax'}, {'test_file': 'inductor/test_subgraph_choice'}, {'test_file': 'inductor/test_cpp_wrapper_hipify'}, {'test_file': 'inductor/test_pad_mm'}, {'test_file': 'inductor/test_multi_kernel'}, {'test_file': 'export/test_export'}, {'test_file': 'inductor/test_inplace_padding'}, {'test_file': 'inductor/test_cutlass_backend'}, {'test_file': 'dynamo/test_deque_reconstruct'}, {'test_file': 'inductor/test_torchbind'}, {'test_file': 'inductor/test_memory'}, {'test_file': 'export/test_retraceability'}, {'test_file': 'inductor/test_cooperative_reductions'}, {'test_file': 'inductor/test_triton_extension_backend'}, {'test_file': 'inductor/test_analysis'}, {'test_file': 'inductor/test_inductor_annotations'}, {'test_file': 'inductor/test_ck_backend'}, {'test_file': 'inductor/test_inductor_utils'}, {'test_file': 'inductor/test_remote_cache'}, {'test_file': 'export/test_functionalized_assertions'}, {'test_file': 'inductor/test_op_completeness'}, {'test_file': 'inductor/test_indexing'}, {'test_file': 'inductor/test_fp8'}, {'test_file': 'inductor/test_fx_fusion'}, {'test_file': 'inductor/test_debug_trace'}, {'test_file': 'inductor/test_online_softmax'}, {'test_file': 'export/test_export_strict'}, {'test_file': 'inductor/test_async_compile'}, {'test_file': 'export/test_export_training_ir_to_run_decomp'}, {'test_file': 'inductor/test_minifier_utils'}, {'test_file': 'export/test_unflatten_training_ir'}, {'test_file': 'export/test_tree_utils'}, {'test_file': 'inductor/test_combo_kernels'}, {'test_file': 'inductor/test_b2b_gemm'}, {'test_file': 'dynamo/test_skip_guard_eval_unsafe'}, {'test_file': 'dynamo/test_functions'}, {'test_file': 'inductor/test_memory_planning'}, {'test_file': 'dynamo/test_interop'}, {'test_file': 'inductor/test_compile_worker'}, {'test_file': 'dynamo/test_buffers_override'}, {'test_file': 'dynamo/test_frame_init'}, {'test_file': 'dynamo/test_base_output'}, {'test_file': 'dynamo/test_fx_passes_pre_grad'}, {'test_file': 'export/test_serdes'}, {'test_file': 'inductor/test_utils'}, {'test_file': 'inductor/test_metrics'}, {'test_file': 'inductor/test_cpu_cpp_wrapper'}, {'test_file': 'inductor/test_move_constructors_to_cuda'}, {'test_file': 'inductor/test_autoheuristic'}, {'test_file': 'dynamo/test_inline_and_install'}, {'test_file': 'inductor/test_aot_inductor_package'}, {'test_file': 'dynamo/test_nops'}, {'test_file': 'dynamo/test_skip_non_tensor'}, {'test_file': 'inductor/test_gpu_cpp_wrapper'}, {'test_file': 'inductor/test_cutlass_evt'}, {'test_file': 'dynamo/test_recompiles'}, {'test_file': 'inductor/test_kernel_optimization'}, {'test_file': 'inductor/test_codegen_triton'}, {'test_file': 'export/test_export_with_inline_and_install'}, {'test_file': 'inductor/test_cudacodecache'}, {'test_file': 'inductor/test_static_cuda_launcher'}, {'test_file': 'inductor/test_op_dtype_prop'}, {'test_file': 'dynamo/test_pre_dispatch'}, {'test_file': 'dynamo/test_resume'}, {'test_file': 'dynamo/test_sdpa'}, {'test_file': 'dynamo/test_metrics_context'}, {'test_file': 'inductor/test_inplacing_pass'}, {'test_file': 'dynamo/test_global'}, {'test_file': 'dynamo/test_graph_region_tracker'}, {'test_file': 'inductor/test_custom_post_grad_passes'}, {'test_file': 'inductor/test_fuzzer'}, {'test_file': 'export/test_tools'}, {'test_file': 'inductor/test_best_config'}, {'test_file': 'dynamo/test_subgraphs'}, {'test_file': 'inductor/test_triton_heuristics'}, {'test_file': 'test_custom_ops'}, {'test_file': 'inductor/test_quantization'}, {'test_file': 'inductor/test_block_analysis'}, {'test_file': 'inductor/test_profiler'}, {'test_file': 'inductor/test_compile_subprocess'}, {'test_file': 'inductor/test_xpu_basic'}, {'test_file': 'dynamo/test_view'}, {'test_file': 'inductor/test_helion_kernels'}, {'test_file': 'inductor/test_codecache'}, {'test_file': 'dynamo/test_list'}, {'test_file': 'inductor/test_aot_inductor_utils'}, {'test_file': 'inductor/test_loop_ordering'}, {'test_file': 'dynamo/test_utils'}, {'test_file': 'inductor/test_ordered_set'}, {'test_file': 'inductor/test_fxir_backend'}, {'test_file': 'inductor/test_decompose_mem_bound_mm'}, {'test_file': 'dynamo/test_dicts'}, {'test_file': 'dynamo/test_install_free_tensors'}, {'test_file': 'dynamo/test_fx_graph_runnable'}, {'test_file': 'dynamo/test_autograd_function'}, {'test_file': 'export/test_cpp_serdes'}, {'test_file': 'dynamo/test_config'}, {'test_file': 'export/test_package'}, {'test_file': 'inductor/test_benchmarking'}, {'test_file': 'inductor/test_needs_exact_strides'}, {'test_file': 'dynamo/test_reconstruct'}, {'test_file': 'inductor/test_inductor_scheduler'}, {'test_file': 'dynamo/test_compile'}, {'test_file': 'inductor/test_compile'}, {'test_file': 'export/test_db'}, {'test_file': 'inductor/test_graph_transform_observer'}, {'test_file': 'dynamo/test_export_mutations'}, {'test_file': 'dynamo/test_unittest'}, {'test_file': 'dynamo/test_pgo'}, {'test_file': 'dynamo/test_sources'}, {'test_file': 'dynamo/test_graph_deduplication'}, {'test_file': 'inductor/test_torchinductor_codegen_config_overrides'}, {'test_file': 'dynamo/test_flat_apply'}, {'test_file': 'dynamo/test_guard_serialization'}, {'test_file': 'dynamo/test_modes'}, {'test_file': 'inductor/test_scatter_optimization'}, {'test_file': 'inductor/test_split_cat_fx_aten_passes'}, {'test_file': 'export/test_passes'}, {'test_file': 'export/test_unflatten'}, {'test_file': 'export/test_upgrader'}, {'test_file': 'export/test_swap'}, {'test_file': 'export/test_schema'}, {'test_file': 'dynamo/test_decorators'}, {'test_file': 'dynamo/test_einops'}, {'test_file': 'dynamo/test_backward_higher_order_ops'}, {'test_file': 'export/test_nativert'}, {'test_file': 'inductor/test_minifier_isolate'}, {'test_file': 'inductor/test_auto_functionalize'}, {'test_file': 'dynamo/test_verify_correctness'}, {'test_file': 'dynamo/test_export'}, {'test_file': 'dynamo/test_model_output'}, {'test_file': 'dynamo/test_sets'}, {'test_file': 'dynamo/test_base_hop'}, {'test_file': 'dynamo/test_profiler'}, {'test_file': 'test_content_store'}, {'test_file': 'dynamo/test_deviceguard'}, {'test_file': 'dynamo/test_optimizers'}, {'test_file': 'dynamo/test_debug_utils'}, {'test_file': 'dynamo/test_python_dispatcher'}, {'test_file': 'export/test_lift_unlift'}, {'test_file': 'dynamo/test_compiler_bisector'}, {'test_file': 'test_model_exports_to_core_aten'}, {'test_file': 'dynamo/test_package'}, {'test_file': 'export/test_experimental'}, {'test_file': 'dynamo/test_input_attr_tracking'}, {'test_file': 'dynamo/test_bytecode_utils'}, {'test_file': 'export/test_hop'}, {'test_file': 'export/test_converter'}, {'test_file': 'dynamo/test_error_messages'}, {'test_file': 'dynamo/test_trace_rules'}, {'test_file': 'dynamo/test_callback'}, {'test_file': 'export/test_pass_infra'}, {'test_file': 'inductor/test_mps_basic'}, {'test_file': 'dynamo/test_exceptions'}, {'test_file': 'dynamo/test_aot_autograd_cache'}, {'test_file': 'inductor/test_efficient_conv_bn_eval'}, {'test_file': 'dynamo/test_precompile_context'}, {'test_file': 'dynamo/test_cudagraphs_expandable_segments'}, {'test_file': 'dynamo/test_guard_manager'}, {'test_file': 'test_sparse_semi_structured'}, {'test_file': 'export/test_serialize'}, {'test_file': 'dynamo/test_structured_trace'}, {'test_file': 'dynamo/test_torchrec'}, {'test_file': 'export/test_verifier'}, {'test_file': 'dynamo/test_recompile_ux'}, {'test_file': 'dynamo/test_minifier'}, {'test_file': 'inductor/test_cudagraph_trees_expandable_segments'}, {'test_file': 'test_functionalization_of_rng_ops'}, {'test_file': 'dynamo/test_fake_distributed'}, {'test_file': 'dynamo/test_hooks'}, {'test_file': 'inductor/test_distributed_patterns'}, {'test_file': 'dynamo/test_generator'}, {'test_file': 'dynamo/test_reorder_logs'}, {'test_file': 'dynamo/test_subclasses'}, {'test_file': 'export/test_sparse'}, {'test_file': 'functorch/test_eager_transforms'}, {'test_file': 'export/test_draft_export'}, {'test_file': 'dynamo/test_comptime'}, {'test_file': 'dynamo/test_python_autograd'}, {'test_file': 'test_quantization'}, {'test_file': 'test_dynamic_shapes'}, {'test_file': 'higher_order_ops/test_invoke_subgraph'}, {'test_file': 'test_package'}, {'test_file': 'functorch/test_rearrange'}, {'test_file': 'functorch/test_parsing'}, {'test_file': 'test_hop_infra'}, {'test_file': 'test_comparison_utils'}, {'test_file': 'functorch/test_control_flow'}, {'test_file': 'functorch/test_ac_logging'}, {'test_file': 'test_mkl_verbose'}, {'test_file': 'test_appending_byte_serializer'}, {'test_file': 'test_cpp_api_parity'}, {'test_file': 'test_autoload'}, {'test_file': 'test_proxy_tensor'}, {'test_file': 'test_mkldnn_verbose'}, {'test_file': 'test_utils_config_module'}, {'test_file': 'test_functionalization'}, {'test_file': 'test_rename_privateuse1_to_existing_device'}, {'test_file': 'test_transformers_privateuse1'}, {'test_file': 'test_transformers'}, {'test_file': 'test_extension_utils'}, {'test_file': 'test_ao_sparsity'}, {'test_file': 'test_license'}, {'test_file': 'torch_np/test_binary_ufuncs'}, {'test_file': 'xpu/test_fusion'}, {'test_file': 'torch_np/test_unary_ufuncs'}, {'test_file': 'test_foreach'}, {'test_file': 'test_expanded_weights'}, {'test_file': 'typing/test_python_operators'}, {'test_file': 'test_fx_experimental'}, {'test_file': 'test_file_check'}, {'test_file': 'torch_np/test_dtype'}, {'test_file': 'higher_order_ops/test_with_effects'}, {'test_file': 'test_native_functions'}, {'test_file': 'functorch/test_minifier'}, {'test_file': 'test_flop_counter'}, {'test_file': 'torch_np/test_nep50_examples'}, {'test_file': 'higher_order_ops/test_invoke_quant'}, {'test_file': 'test_per_overload_api'}, {'test_file': 'test_tensorexpr_pybind'}, {'test_file': 'test_pytree'}, {'test_file': 'test_fx_passes'}, {'test_file': 'test_fx'}, {'test_file': 'test_fake_tensor'}, {'test_file': 'test_module_tracker'}, {'test_file': 'test_typing'}, {'test_file': 'test_openmp'}, {'test_file': 'torch_np/numpy_tests/core/test_scalarinherit'}, {'test_file': 'functorch/test_ac_knapsack'}, {'test_file': 'backends/xeon/test_launch'}, {'test_file': 'test_nestedtensor'}, {'test_file': 'test_namedtensor'}, {'test_file': 'test_openreg'}, {'test_file': 'xpu/test_gemm'}, {'test_file': 'torch_np/test_ufuncs_basic'}, {'test_file': 'test_meta'}, {'test_file': 'test_show_pickle'}, {'test_file': 'test_utils_filelock'}, {'test_file': 'torch_np/test_random'}, {'test_file': 'profiler/test_kineto'}, {'test_file': 'test_tensorexpr'}, {'test_file': 'test_compile_benchmark_util'}, {'test_file': 'test_jiterator'}, {'test_file': 'test_optim'}, {'test_file': 'test_decomp'}, {'test_file': 'torch_np/test_basic'}, {'test_file': 'lazy/test_bindings'}, {'test_file': 'test_legacy_vmap'}, {'test_file': 'test_modules'}, {'test_file': 'test_binary_ufuncs'}, {'test_file': 'test_torch'}, {'test_file': 'test_matmul_cuda'}, {'test_file': 'test_weak'}, {'test_file': 'functorch/test_logging'}, {'test_file': 'test_logging'}, {'test_file': 'test_out_dtype_op'}, {'test_file': 'test_complex'}, {'test_file': 'xpu/test_conv'}, {'test_file': 'cpp_extensions/python_agnostic_extension/test/test_python_agnostic'}, {'test_file': 'torch_np/numpy_tests/core/test_einsum'}, {'test_file': 'test_ops_jit'}, {'test_file': 'lazy/test_step_closures'}, {'test_file': 'test_ops_gradients'}, {'test_file': 'test_fx_reinplace_pass'}, {'test_file': 'nn/test_multihead_attention'}, {'test_file': 'functorch/test_aot_joint_with_descriptors'}, {'test_file': 'test_pruning_op'}, {'test_file': 'lazy/test_functionalization'}, {'test_file': 'test_utils'}, {'test_file': 'functorch/test_ac'}, {'test_file': 'test_python_dispatch'}, {'test_file': 'test_ops_fwd_gradients'}, {'test_file': 'test_hub'}, {'test_file': 'test_set_default_mobile_cpu_allocator'}, {'test_file': 'test_cuda_expandable_segments'}, {'test_file': 'test_stateless'}, {'test_file': 'test_numpy_interop'}, {'test_file': 'test_dispatch'}, {'test_file': 'test_namedtuple_return_api'}, {'test_file': 'profiler/test_record_function'}, {'test_file': 'test_multiprocessing'}, {'test_file': 'test_autograd_fallback'}, {'test_file': 'test_type_hints'}, {'test_file': 'nn/test_lazy_modules'}, {'test_file': 'test_segment_reductions'}, {'test_file': 'test_import_stats'}, {'test_file': 'test_masked'}, {'test_file': 'test_cuda_sanitizer'}, {'test_file': 'test_monitor'}, {'test_file': 'profiler/test_cpp_thread'}, {'test_file': 'test_subclass'}, {'test_file': 'profiler/test_profiler'}, {'test_file': 'functorch/test_vmap_registrations'}, {'test_file': 'nn/test_parametrization'}, {'test_file': 'nn/test_pruning'}, {'test_file': 'functorch/test_dims'}, {'test_file': 'test_itt'}, {'test_file': 'optim/test_lrscheduler'}, {'test_file': 'torch_np/numpy_tests/core/test_multiarray'}, {'test_file': 'test_numba_integration'}, {'test_file': 'test_cpp_extensions_mtia_backend'}, {'test_file': 'test_jit_disabled'}, {'test_file': 'torch_np/numpy_tests/core/test_numeric'}, {'test_file': 'test_sympy_utils'}, {'test_file': 'functorch/test_memory_efficient_fusion'}, {'test_file': 'torch_np/test_indexing'}, {'test_file': 'torch_np/numpy_tests/lib/test_function_base'}, {'test_file': 'test_jit'}, {'test_file': 'nn/test_packed_sequence'}, {'test_file': 'cpp_extensions/libtorch_agnostic_extension/test/test_libtorch_agnostic'}, {'test_file': 'test_tensor_creation_ops'}, {'test_file': 'functorch/test_ops'}, {'test_file': 'functorch/test_aotdispatch'}, {'test_file': 'test_schema_check'}, {'test_file': 'test_datapipe'}, {'test_file': 'test_autocast'}, {'test_file': 'test_bundled_inputs'}, {'test_file': 'profiler/test_execution_trace'}, {'test_file': 'test_cpp_extensions_stream_and_event'}, {'test_file': 'lazy/test_generator'}, {'test_file': 'torch_np/numpy_tests/core/test_indexing'}, {'test_file': 'test_maskedtensor'}, {'test_file': 'test_tensorboard'}, {'test_file': 'torch_np/numpy_tests/lib/test_type_check'}, {'test_file': 'lazy/test_ts_opinfo'}, {'test_file': 'test_mkldnn_fusion'}, {'test_file': 'benchmark_utils/test_benchmark_utils'}, {'test_file': 'optim/test_swa_utils'}, {'test_file': 'test_futures'}, {'test_file': 'torch_np/numpy_tests/lib/test_histograms'}, {'test_file': 'test_cuda'}, {'test_file': 'optim/test_optim'}, {'test_file': 'nn/test_embedding'}, {'test_file': 'nn/test_dropout'}, {'test_file': 'test_autograd'}, {'test_file': 'torch_np/numpy_tests/core/test_scalarmath'}, {'test_file': 'test_functional_optim'}, {'test_file': 'test_vulkan'}, {'test_file': 'functorch/test_vmap'}, {'test_file': 'test_cuda_trace'}, {'test_file': 'nn/test_load_state_dict'}, {'test_file': 'test_reductions'}, {'test_file': 'test_shape_ops'}, {'test_file': 'torch_np/numpy_tests/core/test_dtype'}, {'test_file': 'torch_np/numpy_tests/core/test_shape_base'}, {'test_file': 'test_type_info'}, {'test_file': 'test_dataloader'}, {'test_file': 'test_jit_llga_fuser'}, {'test_file': 'test_indexing'}, {'test_file': 'torch_np/test_function_base'}, {'test_file': 'test_native_mha'}, {'test_file': 'test_sort_and_select'}, {'test_file': 'torch_np/numpy_tests/lib/test_twodim_base'}, {'test_file': 'torch_np/test_scalars_0D_arrays'}, {'test_file': 'nn/test_module_hooks'}, {'test_file': 'test_numa_binding'}, {'test_file': 'torch_np/numpy_tests/core/test_getlimits'}, {'test_file': 'torch_np/test_ndarray_methods'}, {'test_file': 'lazy/test_debug_util'}, {'test_file': 'test_cuda_nvml_based_avail'}, {'test_file': 'torch_np/numpy_tests/lib/test_shape_base_'}, {'test_file': 'torch_np/numpy_tests/linalg/test_linalg'}, {'test_file': 'test_sparse_csr'}, {'test_file': 'test_view_ops'}, {'test_file': 'test_accelerator'}, {'test_file': 'profiler/test_memory_profiler'}, {'test_file': 'test_nn'}, {'test_file': 'torch_np/numpy_tests/fft/test_pocketfft'}, {'test_file': 'test_function_schema'}, {'test_file': 'torch_np/numpy_tests/lib/test_index_tricks'}, {'test_file': 'test_scatter_gather_ops'}, {'test_file': 'test_serialization'}, {'test_file': 'torch_np/numpy_tests/core/test_dlpack'}, {'test_file': 'test_xnnpack_integration'}, {'test_file': 'torch_np/numpy_tests/fft/test_helper'}, {'test_file': 'test_mkldnn'}, {'test_file': 'test_overrides'}, {'test_file': 'test_unary_ufuncs'}, {'test_file': 'nn/test_pooling'}, {'test_file': 'torch_np/numpy_tests/core/test_scalar_methods'}, {'test_file': 'nn/test_init'}, {'test_file': 'test_cuda_primary_ctx'}, {'test_file': 'torch_np/numpy_tests/core/test_numerictypes'}, {'test_file': 'test_mobile_optimizer'}, {'test_file': 'torch_np/numpy_tests/lib/test_arraypad'}, {'test_file': 'test_cuda_multigpu'}, {'test_file': 'test_type_promotion'}, {'test_file': 'test_multiprocessing_spawn'}, {'test_file': 'torch_np/numpy_tests/lib/test_arraysetops'}, {'test_file': 'nn/test_convolution'}, {'test_file': 'test_sparse'}, {'test_file': 'test_jit_autocast'}, {'test_file': 'test_dlpack'}, {'test_file': 'profiler/test_profiler_tree'}, {'test_file': 'torch_np/numpy_tests/core/test_scalar_ctors'}, {'test_file': 'profiler/test_torch_tidy'}, {'test_file': 'torch_np/test_reductions'}, {'test_file': 'lazy/test_reuse_ir'}, {'test_file': 'test_prims'}, {'test_file': 'test_spectral_ops'}, {'test_file': 'profiler/test_python_tracer'}, {'test_file': 'distributions/test_distributions'}, {'test_file': 'test_autoload_enable'}, {'test_file': 'test_cpp_extensions_aot_no_ninja'}, {'test_file': 'test_autoload_disable'}, {'test_file': 'test_cpp_extensions_aot_ninja'}], 'excluded': []} from test/test-reports/td_exclusions-b4313c8c3a1c4e265e3a.json is not a benchmark record, skipping 2025-08-14T23:00:17.5733900Z warn(f"{result} from {filepath} is not a benchmark record, skipping") 2025-08-14T23:00:17.5860289Z ##[group]Run cat test/**/*_toprint.log || true 2025-08-14T23:00:17.5860691Z cat test/**/*_toprint.log || true 2025-08-14T23:00:17.5869766Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:17.5870113Z env: 2025-08-14T23:00:17.5870332Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:17.5870650Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:17.5871164Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:17.5871618Z DEVICE_NAME: 2025-08-14T23:00:17.5871836Z DEVICE_TYPE: 2025-08-14T23:00:17.5872049Z ##[endgroup] 2025-08-14T23:00:17.5988551Z cat: 'test/**/*_toprint.log': No such file or directory 2025-08-14T23:00:17.6047811Z ##[group]Run kill "$MONITOR_SCRIPT_PID" 2025-08-14T23:00:17.6048164Z kill "$MONITOR_SCRIPT_PID" 2025-08-14T23:00:17.6057195Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:17.6057558Z env: 2025-08-14T23:00:17.6057778Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:17.6058102Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:17.6058633Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:17.6059099Z DEVICE_NAME: 2025-08-14T23:00:17.6059323Z DEVICE_TYPE: 2025-08-14T23:00:17.6059740Z MONITOR_SCRIPT_PID: 62541 2025-08-14T23:00:17.6059998Z ##[endgroup] 2025-08-14T23:00:17.6211373Z Prepare all required actions 2025-08-14T23:00:17.6211745Z Getting action download info 2025-08-14T23:00:17.7721389Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-08-14T23:00:18.4360340Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-08-14T23:00:20.4608342Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-08-14T23:00:20.4608674Z with: 2025-08-14T23:00:20.4609003Z file-suffix: test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047 2025-08-14T23:00:20.4609404Z s3-bucket: gha-artifacts 2025-08-14T23:00:20.4609663Z env: 2025-08-14T23:00:20.4609869Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:20.4610188Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:20.4610693Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:20.4611147Z DEVICE_NAME: 2025-08-14T23:00:20.4611387Z DEVICE_TYPE: 2025-08-14T23:00:20.4611606Z ##[endgroup] 2025-08-14T23:00:20.4726084Z ##[group]Run # Remove any previous test jsons if they exist 2025-08-14T23:00:20.4726512Z # Remove any previous test jsons if they exist 2025-08-14T23:00:20.4726859Z rm -f test-jsons-*.zip 2025-08-14T23:00:20.4727267Z zip -r "test-jsons-${FILE_SUFFIX}.zip" test/test-reports -i '*.json' 2025-08-14T23:00:20.4736435Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:20.4736787Z env: 2025-08-14T23:00:20.4736991Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:20.4737311Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:20.4737951Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:20.4738408Z DEVICE_NAME: 2025-08-14T23:00:20.4738629Z DEVICE_TYPE: 2025-08-14T23:00:20.4738962Z FILE_SUFFIX: test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047 2025-08-14T23:00:20.4739341Z ##[endgroup] 2025-08-14T23:00:20.5528135Z adding: test/test-reports/td_exclusions-b4313c8c3a1c4e265e3a.json (deflated 82%) 2025-08-14T23:00:20.5616720Z ##[group]Run # Remove any previous test reports if they exist 2025-08-14T23:00:20.5617162Z # Remove any previous test reports if they exist 2025-08-14T23:00:20.5617525Z rm -f test-reports-*.zip 2025-08-14T23:00:20.5617971Z zip -r "test-reports-${FILE_SUFFIX}.zip" test/test-reports -i '*.xml' -i '*.csv' 2025-08-14T23:00:20.5626807Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:20.5627163Z env: 2025-08-14T23:00:20.5627379Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:20.5627703Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:20.5628220Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:20.5628681Z DEVICE_NAME: 2025-08-14T23:00:20.5628898Z DEVICE_TYPE: 2025-08-14T23:00:20.5629251Z FILE_SUFFIX: test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047 2025-08-14T23:00:20.5629645Z ##[endgroup] 2025-08-14T23:00:20.5858295Z adding: test/test-reports/python-pytest/export.test_torchbind/export.test_torchbind-8d98fca8ce52e28b.xml (deflated 28%) 2025-08-14T23:00:20.5859240Z adding: test/test-reports/python-pytest/export.test_torchbind/export.test_torchbind-3291203cb383cf37.xml (deflated 93%) 2025-08-14T23:00:20.5860070Z adding: test/test-reports/python-pytest/test_testing/test_testing-644a7d86d7425f95.xml (deflated 27%) 2025-08-14T23:00:20.5908934Z adding: test/test-reports/python-pytest/test_testing/test_testing-b94f7fdfef85d577.xml (deflated 98%) 2025-08-14T23:00:20.5910671Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-e055b6ce95cccc25.xml (deflated 28%) 2025-08-14T23:00:20.5933245Z adding: test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-036e070e5478acd1.xml (deflated 96%) 2025-08-14T23:00:20.5934478Z adding: test/test-reports/python-pytest/test_jit_fuser_te/test_jit_fuser_te-9d6175ae5ac79ead.xml (deflated 28%) 2025-08-14T23:00:20.6072749Z adding: test/test-reports/python-pytest/test_jit_fuser_te/test_jit_fuser_te-c9920704a0e5a892.xml (deflated 98%) 2025-08-14T23:00:20.6073615Z adding: test/test-reports/python-pytest/test_public_bindings/test_public_bindings-9b59939a5e74c38e.xml (deflated 28%) 2025-08-14T23:00:20.6074496Z adding: test/test-reports/python-pytest/test_public_bindings/test_public_bindings-f3588d4656a7f4f9.xml (deflated 77%) 2025-08-14T23:00:20.6075302Z adding: test/test-reports/python-pytest/test_linalg/test_linalg-f6575d2379a28214.xml (deflated 87%) 2025-08-14T23:00:20.6102196Z adding: test/test-reports/python-pytest/test_linalg/test_linalg-c7caf4bce7faa7cd.xml (deflated 98%) 2025-08-14T23:00:20.6103207Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-669f6a14eb10c686.xml (deflated 66%) 2025-08-14T23:00:20.6125298Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_dynamic_shapes/inductor.test_torchinductor_dynamic_shapes-67a12e4292cafa79.xml (deflated 96%) 2025-08-14T23:00:20.6126760Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-bc38f492ba570baa.xml (deflated 28%) 2025-08-14T23:00:20.6129285Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-e0342efcfc8eb521.xml (deflated 66%) 2025-08-14T23:00:20.6150373Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-413cc6b4fd53f42a.xml (deflated 96%) 2025-08-14T23:00:20.6173375Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_codegen_dynamic_shapes/inductor.test_torchinductor_codegen_dynamic_shapes-ec22e49769b42691.xml (deflated 97%) 2025-08-14T23:00:20.6174628Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-cc344b8d982db35d.xml (deflated 28%) 2025-08-14T23:00:20.6175755Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-f92a34a6747eadc1.xml (deflated 28%) 2025-08-14T23:00:20.6176867Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-0732759d01b80abb.xml (deflated 28%) 2025-08-14T23:00:20.6177981Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-c069c44c9245d821.xml (deflated 27%) 2025-08-14T23:00:20.6183099Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-7a6fd6830cb2633f.xml (deflated 97%) 2025-08-14T23:00:20.6193328Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-2ffd155d1588b65f.xml (deflated 98%) 2025-08-14T23:00:20.6201054Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-0d342f40c0a6729f.xml (deflated 98%) 2025-08-14T23:00:20.6209129Z adding: test/test-reports/python-pytest/inductor.test_torchinductor_opinfo/inductor.test_torchinductor_opinfo-708a0368fce102fd.xml (deflated 98%) 2025-08-14T23:00:20.6210191Z adding: test/test-reports/python-pytest/inductor.test_cpu_repro/inductor.test_cpu_repro-bc0959669641b78c.xml (deflated 28%) 2025-08-14T23:00:20.6228470Z adding: test/test-reports/python-pytest/inductor.test_cpu_repro/inductor.test_cpu_repro-dbf1e4b661d2a345.xml (deflated 98%) 2025-08-14T23:00:20.6229439Z adding: test/test-reports/python-pytest/inductor.test_cuda_repro/inductor.test_cuda_repro-1804bb19eaf29dc2.xml (deflated 28%) 2025-08-14T23:00:20.6231053Z adding: test/test-reports/python-pytest/inductor.test_cuda_repro/inductor.test_cuda_repro-26deeb2dce72c89c.xml (deflated 94%) 2025-08-14T23:00:20.6232278Z adding: test/test-reports/python-pytest/inductor.test_kernel_benchmark/inductor.test_kernel_benchmark-b692c76211542f55.xml (deflated 28%) 2025-08-14T23:00:20.6233395Z adding: test/test-reports/python-pytest/inductor.test_kernel_benchmark/inductor.test_kernel_benchmark-e0707ad50870b837.xml (deflated 90%) 2025-08-14T23:00:20.6234426Z adding: test/test-reports/python-pytest/inductor.test_cudagraph_trees/inductor.test_cudagraph_trees-b59fa59f91059cb2.xml (deflated 28%) 2025-08-14T23:00:20.6238919Z adding: test/test-reports/python-pytest/inductor.test_cudagraph_trees/inductor.test_cudagraph_trees-e2744b261ff9743a.xml (deflated 95%) 2025-08-14T23:00:20.6239930Z adding: test/test-reports/python-pytest/dynamo.test_dynamic_shapes/dynamo.test_dynamic_shapes-7505778301a9bee5.xml (deflated 64%) 2025-08-14T23:00:20.6240897Z adding: test/test-reports/python-pytest/dynamo.test_dynamic_shapes/dynamo.test_dynamic_shapes-9637c65aa9646160.xml (deflated 28%) 2025-08-14T23:00:20.6268654Z adding: test/test-reports/python-pytest/dynamo.test_dynamic_shapes/dynamo.test_dynamic_shapes-265e3e1201c1dc13.xml (deflated 95%) 2025-08-14T23:00:20.6296080Z adding: test/test-reports/python-pytest/dynamo.test_dynamic_shapes/dynamo.test_dynamic_shapes-c5a69923a663cacc.xml (deflated 96%) 2025-08-14T23:00:20.6297017Z adding: test/test-reports/python-pytest/dynamo.test_unspec/dynamo.test_unspec-e9afad2af7ff06ef.xml (deflated 28%) 2025-08-14T23:00:20.6297958Z adding: test/test-reports/python-pytest/dynamo.test_unspec/dynamo.test_unspec-298ade709d871f59.xml (deflated 93%) 2025-08-14T23:00:20.6298857Z adding: test/test-reports/python-pytest/dynamo.test_repros/dynamo.test_repros-2ea139ddce5a0fc6.xml (deflated 62%) 2025-08-14T23:00:20.6308417Z adding: test/test-reports/python-pytest/dynamo.test_repros/dynamo.test_repros-e35a0cb439b4459b.xml (deflated 94%) 2025-08-14T23:00:20.6309368Z adding: test/test-reports/python-pytest/inductor.test_pattern_matcher/inductor.test_pattern_matcher-01f9b75f099a9b5c.xml (deflated 28%) 2025-08-14T23:00:20.6310431Z adding: test/test-reports/python-pytest/inductor.test_pattern_matcher/inductor.test_pattern_matcher-22c7353f7a394753.xml (deflated 93%) 2025-08-14T23:00:20.6311527Z adding: test/test-reports/python-pytest/inductor.test_mkldnn_pattern_matcher/inductor.test_mkldnn_pattern_matcher-56bdbdf13f97fbab.xml (deflated 28%) 2025-08-14T23:00:20.6318610Z adding: test/test-reports/python-pytest/inductor.test_mkldnn_pattern_matcher/inductor.test_mkldnn_pattern_matcher-69227a95c1e12d49.xml (deflated 98%) 2025-08-14T23:00:20.6319719Z adding: test/test-reports/python-pytest/inductor.test_extension_backend/inductor.test_extension_backend-f70cdedb26f3b31f.xml (deflated 28%) 2025-08-14T23:00:20.6320783Z adding: test/test-reports/python-pytest/inductor.test_extension_backend/inductor.test_extension_backend-88445701cae2d9a8.xml (deflated 58%) 2025-08-14T23:00:20.6321753Z adding: test/test-reports/python-pytest/inductor.test_perf/inductor.test_perf-bc1fbddd255ef323.xml (deflated 28%) 2025-08-14T23:00:20.6322631Z adding: test/test-reports/python-pytest/inductor.test_perf/inductor.test_perf-1e21db308a422365.xml (deflated 95%) 2025-08-14T23:00:20.6323520Z adding: test/test-reports/python-pytest/inductor.test_foreach/inductor.test_foreach-b10955e43affa060.xml (deflated 28%) 2025-08-14T23:00:20.6334127Z adding: test/test-reports/python-pytest/inductor.test_foreach/inductor.test_foreach-afd2b5f5797e5cc4.xml (deflated 98%) 2025-08-14T23:00:20.6335040Z adding: test/test-reports/python-pytest/dynamo.test_modules/dynamo.test_modules-d3724171c853b828.xml (deflated 28%) 2025-08-14T23:00:20.6338099Z adding: test/test-reports/python-pytest/dynamo.test_modules/dynamo.test_modules-c6cedebe2adb3bbf.xml (deflated 96%) 2025-08-14T23:00:20.6339065Z adding: test/test-reports/python-pytest/inductor.test_triton_wrapper/inductor.test_triton_wrapper-313a2242f74e7db0.xml (deflated 28%) 2025-08-14T23:00:20.6340084Z adding: test/test-reports/python-pytest/inductor.test_triton_wrapper/inductor.test_triton_wrapper-110489bfe9bc88f4.xml (deflated 45%) 2025-08-14T23:00:20.6341390Z adding: test/test-reports/python-pytest/inductor.test_coordinate_descent_tuner/inductor.test_coordinate_descent_tuner-7cc227bb0131f4b8.xml (deflated 28%) 2025-08-14T23:00:20.6342563Z adding: test/test-reports/python-pytest/inductor.test_coordinate_descent_tuner/inductor.test_coordinate_descent_tuner-85f1f17c5511fcc0.xml (deflated 82%) 2025-08-14T23:00:20.6343674Z adding: test/test-reports/python-pytest/inductor.test_select_algorithm/inductor.test_select_algorithm-5f3f7fd1eed72873.xml (deflated 28%) 2025-08-14T23:00:20.6344731Z adding: test/test-reports/python-pytest/inductor.test_select_algorithm/inductor.test_select_algorithm-32fc23b71e073928.xml (deflated 93%) 2025-08-14T23:00:20.6345761Z adding: test/test-reports/python-pytest/inductor.test_dependencies/inductor.test_dependencies-364f765af7763f1a.xml (deflated 27%) 2025-08-14T23:00:20.6346764Z adding: test/test-reports/python-pytest/inductor.test_dependencies/inductor.test_dependencies-6ffb098c7cae737a.xml (deflated 82%) 2025-08-14T23:00:20.6348022Z adding: test/test-reports/python-pytest/inductor.test_compiled_autograd/inductor.test_compiled_autograd-e2f811e6fa81b773.xml (deflated 28%) 2025-08-14T23:00:20.6356597Z adding: test/test-reports/python-pytest/inductor.test_compiled_autograd/inductor.test_compiled_autograd-f59679583618611f.xml (deflated 95%) 2025-08-14T23:00:20.6357711Z adding: test/test-reports/python-pytest/inductor.test_triton_extension_backend/inductor.test_triton_extension_backend-975796eee94a08b5.xml (deflated 28%) 2025-08-14T23:00:20.6358866Z adding: test/test-reports/python-pytest/inductor.test_triton_extension_backend/inductor.test_triton_extension_backend-eeba7c45032f1eeb.xml (deflated 28%) 2025-08-14T23:00:20.6360054Z adding: test/test-reports/python-pytest/inductor.test_autoheuristic/inductor.test_autoheuristic-ad2e122c83f11b3f.xml (deflated 28%) 2025-08-14T23:00:20.6361090Z adding: test/test-reports/python-pytest/inductor.test_autoheuristic/inductor.test_autoheuristic-58ac0dca85265ed8.xml (deflated 28%) 2025-08-14T23:00:20.6362123Z adding: test/test-reports/python-pytest/dynamo.test_precompile_context/dynamo.test_precompile_context-0713064559078796.xml (deflated 28%) 2025-08-14T23:00:20.6363164Z adding: test/test-reports/python-pytest/dynamo.test_precompile_context/dynamo.test_precompile_context-15d1b363d7f7ac89.xml (deflated 74%) 2025-08-14T23:00:20.6364293Z adding: test/test-reports/python-pytest/dynamo.test_cudagraphs_expandable_segments/dynamo.test_cudagraphs_expandable_segments-3c507de202644905.xml (deflated 28%) 2025-08-14T23:00:20.6365591Z adding: test/test-reports/python-pytest/dynamo.test_cudagraphs_expandable_segments/dynamo.test_cudagraphs_expandable_segments-082aae9fcf4c890d.xml (deflated 87%) 2025-08-14T23:00:20.6366716Z adding: test/test-reports/python-pytest/dynamo.test_guard_manager/dynamo.test_guard_manager-6b48c5825446e72e.xml (deflated 28%) 2025-08-14T23:00:20.6367698Z adding: test/test-reports/python-pytest/dynamo.test_guard_manager/dynamo.test_guard_manager-147970608ebc6e49.xml (deflated 93%) 2025-08-14T23:00:20.6368682Z adding: test/test-reports/python-pytest/test_sparse_semi_structured/test_sparse_semi_structured-acc9dac84809e0c8.xml (deflated 28%) 2025-08-14T23:00:20.6369666Z adding: test/test-reports/python-pytest/test_sparse_semi_structured/test_sparse_semi_structured-06d381c97e706a8f.xml (deflated 94%) 2025-08-14T23:00:20.6370606Z adding: test/test-reports/python-pytest/export.test_serialize/export.test_serialize-8e17f4eafc216c62.xml (deflated 28%) 2025-08-14T23:00:20.6371522Z adding: test/test-reports/python-pytest/export.test_serialize/export.test_serialize-4489f4083abecfbc.xml (deflated 95%) 2025-08-14T23:00:20.6372478Z adding: test/test-reports/python-pytest/dynamo.test_structured_trace/dynamo.test_structured_trace-86316e798e174930.xml (deflated 28%) 2025-08-14T23:00:20.6373485Z adding: test/test-reports/python-pytest/dynamo.test_structured_trace/dynamo.test_structured_trace-3af5edc47afeaf42.xml (deflated 91%) 2025-08-14T23:00:20.6374561Z adding: test/test-reports/python-pytest/export.test_verifier/export.test_verifier-0cf23214fa3d1f56.xml (deflated 28%) 2025-08-14T23:00:20.6375522Z adding: test/test-reports/python-pytest/export.test_verifier/export.test_verifier-a7e819c681d979e8.xml (deflated 88%) 2025-08-14T23:00:20.6376441Z adding: test/test-reports/python-pytest/dynamo.test_recompile_ux/dynamo.test_recompile_ux-f0ca8944bb26f955.xml (deflated 28%) 2025-08-14T23:00:20.6377437Z adding: test/test-reports/python-pytest/dynamo.test_recompile_ux/dynamo.test_recompile_ux-bcd88b17873abd60.xml (deflated 87%) 2025-08-14T23:00:20.6378348Z adding: test/test-reports/python-pytest/dynamo.test_minifier/dynamo.test_minifier-6abee325b4b77885.xml (deflated 28%) 2025-08-14T23:00:20.6379236Z adding: test/test-reports/python-pytest/dynamo.test_minifier/dynamo.test_minifier-9c14e454970e90fb.xml (deflated 91%) 2025-08-14T23:00:20.6380359Z adding: test/test-reports/python-pytest/inductor.test_cudagraph_trees_expandable_segments/inductor.test_cudagraph_trees_expandable_segments-ebbb13fb34fa97c2.xml (deflated 28%) 2025-08-14T23:00:20.6381710Z adding: test/test-reports/python-pytest/inductor.test_cudagraph_trees_expandable_segments/inductor.test_cudagraph_trees_expandable_segments-67b2d976e9645e7f.xml (deflated 95%) 2025-08-14T23:00:20.6382917Z adding: test/test-reports/python-pytest/test_functionalization_of_rng_ops/test_functionalization_of_rng_ops-8f82d7e68ec864f7.xml (deflated 28%) 2025-08-14T23:00:20.6383990Z adding: test/test-reports/python-pytest/test_functionalization_of_rng_ops/test_functionalization_of_rng_ops-ce8e865264273634.xml (deflated 89%) 2025-08-14T23:00:20.6384954Z adding: test/test-reports/python-pytest/dynamo.test_hooks/dynamo.test_hooks-0c1ae735859903b7.xml (deflated 28%) 2025-08-14T23:00:20.6385846Z adding: test/test-reports/python-pytest/dynamo.test_hooks/dynamo.test_hooks-fe85c93cfd7c97d7.xml (deflated 93%) 2025-08-14T23:00:20.6386725Z adding: test/test-reports/python-pytest/dynamo.test_generator/dynamo.test_generator-998ee05a46928d67.xml (deflated 28%) 2025-08-14T23:00:20.6387632Z adding: test/test-reports/python-pytest/dynamo.test_generator/dynamo.test_generator-b95326b3ef10999b.xml (deflated 95%) 2025-08-14T23:00:20.6388555Z adding: test/test-reports/python-pytest/dynamo.test_reorder_logs/dynamo.test_reorder_logs-55d9a1fc314640e4.xml (deflated 28%) 2025-08-14T23:00:20.6389498Z adding: test/test-reports/python-pytest/dynamo.test_reorder_logs/dynamo.test_reorder_logs-fe206cb81d473bde.xml (deflated 91%) 2025-08-14T23:00:20.6404112Z adding: test/test-reports/python-pytest/dynamo.test_subclasses/dynamo.test_subclasses-59c3d2c19f23bda7.xml (deflated 28%) 2025-08-14T23:00:20.6405159Z adding: test/test-reports/python-pytest/dynamo.test_subclasses/dynamo.test_subclasses-244b1da4e988347e.xml (deflated 95%) 2025-08-14T23:00:20.6406068Z adding: test/test-reports/python-pytest/export.test_sparse/export.test_sparse-e61bb788e99a3eb9.xml (deflated 28%) 2025-08-14T23:00:20.6406948Z adding: test/test-reports/python-pytest/export.test_sparse/export.test_sparse-f1b94e8e05abc7a1.xml (deflated 98%) 2025-08-14T23:00:20.6407920Z adding: test/test-reports/python-pytest/functorch.test_eager_transforms/functorch.test_eager_transforms-e03c70b983538261.xml (deflated 28%) 2025-08-14T23:00:20.6408989Z adding: test/test-reports/python-pytest/functorch.test_eager_transforms/functorch.test_eager_transforms-6f89715a86540153.xml (deflated 96%) 2025-08-14T23:00:20.6409992Z adding: test/test-reports/python-pytest/export.test_draft_export/export.test_draft_export-7638dc4c42005c36.xml (deflated 28%) 2025-08-14T23:00:20.6410935Z adding: test/test-reports/python-pytest/export.test_draft_export/export.test_draft_export-71c2586372a0d0f4.xml (deflated 91%) 2025-08-14T23:00:20.6411855Z adding: test/test-reports/python-pytest/dynamo.test_comptime/dynamo.test_comptime-5b6a902d0911fac7.xml (deflated 28%) 2025-08-14T23:00:20.6412739Z adding: test/test-reports/python-pytest/dynamo.test_comptime/dynamo.test_comptime-1ba0ab1746ba8302.xml (deflated 90%) 2025-08-14T23:00:20.6413897Z adding: test/test-reports/python-pytest/dynamo.test_python_autograd/dynamo.test_python_autograd-d81cb0019977f741.xml (deflated 28%) 2025-08-14T23:00:20.6414882Z adding: test/test-reports/python-pytest/dynamo.test_python_autograd/dynamo.test_python_autograd-d90f45447396cd02.xml (deflated 82%) 2025-08-14T23:00:20.6415813Z adding: test/test-reports/python-pytest/test_quantization/test_quantization-1bab16ffccb994c8.xml (deflated 27%) 2025-08-14T23:00:20.6497857Z adding: test/test-reports/python-pytest/test_quantization/test_quantization-f1ac158da761fb0a.xml (deflated 99%) 2025-08-14T23:00:20.6498738Z adding: test/test-reports/python-pytest/test_dynamic_shapes/test_dynamic_shapes-eca6f3f6e395d58f.xml (deflated 28%) 2025-08-14T23:00:20.6505831Z adding: test/test-reports/python-pytest/test_dynamic_shapes/test_dynamic_shapes-fc9e07303c0f1cf9.xml (deflated 97%) 2025-08-14T23:00:20.6506910Z adding: test/test-reports/python-pytest/higher_order_ops.test_invoke_subgraph/higher_order_ops.test_invoke_subgraph-d6214308b8d6ae12.xml (deflated 28%) 2025-08-14T23:00:20.6508184Z adding: test/test-reports/python-pytest/higher_order_ops.test_invoke_subgraph/higher_order_ops.test_invoke_subgraph-6059f5d69005dcc1.xml (deflated 95%) 2025-08-14T23:00:20.6509121Z adding: test/test-reports/python-pytest/test_package/test_package-ac6045080c22288d.xml (deflated 28%) 2025-08-14T23:00:20.6512928Z adding: test/test-reports/python-pytest/test_package/test_package-bea90ce2e7035373.xml (deflated 94%) 2025-08-14T23:00:20.6513805Z adding: test/test-reports/python-pytest/functorch.test_rearrange/functorch.test_rearrange-cff18efce60be7ca.xml (deflated 28%) 2025-08-14T23:00:20.6514905Z adding: test/test-reports/python-pytest/functorch.test_rearrange/functorch.test_rearrange-3b0ef1c32fa5ece9.xml (deflated 88%) 2025-08-14T23:00:20.6515858Z adding: test/test-reports/python-pytest/functorch.test_parsing/functorch.test_parsing-0bbfd8f8ee611928.xml (deflated 28%) 2025-08-14T23:00:20.6516795Z adding: test/test-reports/python-pytest/functorch.test_parsing/functorch.test_parsing-f9aac4554d0f2063.xml (deflated 88%) 2025-08-14T23:00:20.6517665Z adding: test/test-reports/python-pytest/test_hop_infra/test_hop_infra-a0836d603aab2395.xml (deflated 28%) 2025-08-14T23:00:20.6518449Z adding: test/test-reports/python-pytest/test_hop_infra/test_hop_infra-5da1394dbaf35593.xml (deflated 72%) 2025-08-14T23:00:20.6519295Z adding: test/test-reports/python-pytest/test_comparison_utils/test_comparison_utils-2c9c1ebf332c5f54.xml (deflated 28%) 2025-08-14T23:00:20.6520191Z adding: test/test-reports/python-pytest/test_comparison_utils/test_comparison_utils-789cf5807a37f93e.xml (deflated 86%) 2025-08-14T23:00:20.6521144Z adding: test/test-reports/python-pytest/functorch.test_control_flow/functorch.test_control_flow-be6593c64a5ff5d1.xml (deflated 28%) 2025-08-14T23:00:20.6549938Z adding: test/test-reports/python-pytest/functorch.test_control_flow/functorch.test_control_flow-57526baee8705f79.xml (deflated 98%) 2025-08-14T23:00:20.6550944Z adding: test/test-reports/python-pytest/functorch.test_ac_logging/functorch.test_ac_logging-4a4405c4736275f4.xml (deflated 28%) 2025-08-14T23:00:20.6551911Z adding: test/test-reports/python-pytest/functorch.test_ac_logging/functorch.test_ac_logging-be41a196cade2c7f.xml (deflated 63%) 2025-08-14T23:00:20.6552789Z adding: test/test-reports/python-pytest/test_mkl_verbose/test_mkl_verbose-5304d9ffa0216a6d.xml (deflated 28%) 2025-08-14T23:00:20.6553613Z adding: test/test-reports/python-pytest/test_mkl_verbose/test_mkl_verbose-6579fa56e8b470d0.xml (deflated 64%) 2025-08-14T23:00:20.6554541Z adding: test/test-reports/python-pytest/test_appending_byte_serializer/test_appending_byte_serializer-d5bf1cacf9b4a6bf.xml (deflated 28%) 2025-08-14T23:00:20.6555578Z adding: test/test-reports/python-pytest/test_appending_byte_serializer/test_appending_byte_serializer-e674d5f518ad5535.xml (deflated 74%) 2025-08-14T23:00:20.6556654Z adding: test/test-reports/python-pytest/test_autoload/test_autoload-8dbb4ec3c04c6b28.xml (deflated 28%) 2025-08-14T23:00:20.6557535Z adding: test/test-reports/python-pytest/test_autoload/test_autoload-501d57cf58477113.xml (deflated 45%) 2025-08-14T23:00:20.6558350Z adding: test/test-reports/python-pytest/test_proxy_tensor/test_proxy_tensor-3efa32868a9c21ca.xml (deflated 28%) 2025-08-14T23:00:20.6559188Z adding: test/test-reports/python-pytest/test_proxy_tensor/test_proxy_tensor-f9653ceb6a010f18.xml (deflated 96%) 2025-08-14T23:00:20.6560032Z adding: test/test-reports/python-pytest/test_mkldnn_verbose/test_mkldnn_verbose-ab15ec312539c501.xml (deflated 28%) 2025-08-14T23:00:20.6560895Z adding: test/test-reports/python-pytest/test_mkldnn_verbose/test_mkldnn_verbose-fd0ab0180b73f6e9.xml (deflated 64%) 2025-08-14T23:00:20.6561801Z adding: test/test-reports/python-pytest/test_utils_config_module/test_utils_config_module-5f47b208bedbeeeb.xml (deflated 28%) 2025-08-14T23:00:20.6562747Z adding: test/test-reports/python-pytest/test_utils_config_module/test_utils_config_module-065604516d010482.xml (deflated 93%) 2025-08-14T23:00:20.6563676Z adding: test/test-reports/python-pytest/test_functionalization/test_functionalization-bc457c41d572125d.xml (deflated 28%) 2025-08-14T23:00:20.6564690Z adding: test/test-reports/python-pytest/test_functionalization/test_functionalization-c640921613fdda3b.xml (deflated 96%) 2025-08-14T23:00:20.6565768Z adding: test/test-reports/python-pytest/test_rename_privateuse1_to_existing_device/test_rename_privateuse1_to_existing_device-67d9295fb34cb762.xml (deflated 28%) 2025-08-14T23:00:20.6567023Z adding: test/test-reports/python-pytest/test_rename_privateuse1_to_existing_device/test_rename_privateuse1_to_existing_device-4f24fe9eec51bd0c.xml (deflated 46%) 2025-08-14T23:00:20.6568126Z adding: test/test-reports/python-pytest/test_transformers/test_transformers-237f7f952edbff66.xml (deflated 28%) 2025-08-14T23:00:20.6868933Z adding: test/test-reports/python-pytest/test_transformers/test_transformers-f14224c26c40abc3.xml (deflated 99%) 2025-08-14T23:00:20.6869821Z adding: test/test-reports/python-pytest/test_ao_sparsity/test_ao_sparsity-81dfe510d65d7b25.xml (deflated 28%) 2025-08-14T23:00:20.6872073Z adding: test/test-reports/python-pytest/test_ao_sparsity/test_ao_sparsity-9f52dc4ae5c9a917.xml (deflated 94%) 2025-08-14T23:00:20.6872901Z adding: test/test-reports/python-pytest/test_license/test_license-b1a0a5ce7f0994fe.xml (deflated 28%) 2025-08-14T23:00:20.6873672Z adding: test/test-reports/python-pytest/test_license/test_license-2ff3623700cf3f8d.xml (deflated 57%) 2025-08-14T23:00:20.6874552Z adding: test/test-reports/python-pytest/torch_np.test_binary_ufuncs/torch_np.test_binary_ufuncs-4a1224bb9b524dfe.xml (deflated 27%) 2025-08-14T23:00:20.6875531Z adding: test/test-reports/python-pytest/torch_np.test_binary_ufuncs/torch_np.test_binary_ufuncs-4fc15291270a0498.xml (deflated 95%) 2025-08-14T23:00:20.6876507Z adding: test/test-reports/python-pytest/torch_np.test_unary_ufuncs/torch_np.test_unary_ufuncs-db48a74a8f5ca981.xml (deflated 28%) 2025-08-14T23:00:20.6877473Z adding: test/test-reports/python-pytest/torch_np.test_unary_ufuncs/torch_np.test_unary_ufuncs-667d3e8b59bd4809.xml (deflated 95%) 2025-08-14T23:00:20.6878332Z adding: test/test-reports/python-pytest/test_foreach/test_foreach-a06bec2725b42a3a.xml (deflated 28%) 2025-08-14T23:00:20.6959594Z adding: test/test-reports/python-pytest/test_foreach/test_foreach-dfc2bf37ef417995.xml (deflated 99%) 2025-08-14T23:00:20.6960456Z adding: test/test-reports/python-pytest/test_expanded_weights/test_expanded_weights-b4dc5407b532a40a.xml (deflated 28%) 2025-08-14T23:00:20.6965825Z adding: test/test-reports/python-pytest/test_expanded_weights/test_expanded_weights-ebeacb259f2d44d1.xml (deflated 98%) 2025-08-14T23:00:20.6966844Z adding: test/test-reports/python-pytest/typing.test_python_operators/typing.test_python_operators-f0950f5a19ce66da.xml (deflated 28%) 2025-08-14T23:00:20.6975370Z adding: test/test-reports/python-pytest/typing.test_python_operators/typing.test_python_operators-82203373b385b0e6.xml (deflated 98%) 2025-08-14T23:00:20.6976418Z adding: test/test-reports/python-pytest/test_fx_experimental/test_fx_experimental-e13d6605bae85048.xml (deflated 28%) 2025-08-14T23:00:20.6992550Z adding: test/test-reports/python-pytest/test_fx_experimental/test_fx_experimental-a7d185886b9a9442.xml (deflated 98%) 2025-08-14T23:00:20.6993424Z adding: test/test-reports/python-pytest/test_file_check/test_file_check-41c289a705473241.xml (deflated 28%) 2025-08-14T23:00:20.6994218Z adding: test/test-reports/python-pytest/test_file_check/test_file_check-b7dfde3842157c48.xml (deflated 63%) 2025-08-14T23:00:20.6995061Z adding: test/test-reports/python-pytest/torch_np.test_dtype/torch_np.test_dtype-bf3781c269371f45.xml (deflated 28%) 2025-08-14T23:00:20.6995927Z adding: test/test-reports/python-pytest/torch_np.test_dtype/torch_np.test_dtype-368065dfa65c33a6.xml (deflated 96%) 2025-08-14T23:00:20.6996962Z adding: test/test-reports/python-pytest/higher_order_ops.test_with_effects/higher_order_ops.test_with_effects-0b183f65ebab30c0.xml (deflated 28%) 2025-08-14T23:00:20.6998034Z adding: test/test-reports/python-pytest/higher_order_ops.test_with_effects/higher_order_ops.test_with_effects-ab1316909328be06.xml (deflated 84%) 2025-08-14T23:00:20.6999028Z adding: test/test-reports/python-pytest/test_native_functions/test_native_functions-8df2270b62db7260.xml (deflated 28%) 2025-08-14T23:00:20.6999944Z adding: test/test-reports/python-pytest/test_native_functions/test_native_functions-5a54e4418c22a601.xml (deflated 89%) 2025-08-14T23:00:20.7000881Z adding: test/test-reports/python-pytest/functorch.test_minifier/functorch.test_minifier-b10fed7513a26f1a.xml (deflated 28%) 2025-08-14T23:00:20.7001917Z adding: test/test-reports/python-pytest/functorch.test_minifier/functorch.test_minifier-28990ca4c4daa7d3.xml (deflated 81%) 2025-08-14T23:00:20.7002801Z adding: test/test-reports/python-pytest/test_flop_counter/test_flop_counter-ed83b5284150fc45.xml (deflated 28%) 2025-08-14T23:00:20.7003636Z adding: test/test-reports/python-pytest/test_flop_counter/test_flop_counter-b59490b3f686a861.xml (deflated 92%) 2025-08-14T23:00:20.7004647Z adding: test/test-reports/python-pytest/torch_np.test_nep50_examples/torch_np.test_nep50_examples-e1ccedad9af8faf6.xml (deflated 28%) 2025-08-14T23:00:20.7031649Z adding: test/test-reports/python-pytest/torch_np.test_nep50_examples/torch_np.test_nep50_examples-7de2f0f0ee76d7cc.xml (deflated 99%) 2025-08-14T23:00:20.7032699Z adding: test/test-reports/python-pytest/higher_order_ops.test_invoke_quant/higher_order_ops.test_invoke_quant-207c21cf9f1782ca.xml (deflated 28%) 2025-08-14T23:00:20.7033778Z adding: test/test-reports/python-pytest/higher_order_ops.test_invoke_quant/higher_order_ops.test_invoke_quant-5169c67f608b9a0b.xml (deflated 92%) 2025-08-14T23:00:20.7034775Z adding: test/test-reports/python-pytest/test_per_overload_api/test_per_overload_api-574e9a0a7fd742fd.xml (deflated 28%) 2025-08-14T23:00:20.7035667Z adding: test/test-reports/python-pytest/test_per_overload_api/test_per_overload_api-3a8eda95778ca505.xml (deflated 73%) 2025-08-14T23:00:20.7036582Z adding: test/test-reports/python-pytest/test_tensorexpr_pybind/test_tensorexpr_pybind-1f13a799c05b815e.xml (deflated 28%) 2025-08-14T23:00:20.7037549Z adding: test/test-reports/python-pytest/test_tensorexpr_pybind/test_tensorexpr_pybind-c9887ffd8ef479eb.xml (deflated 90%) 2025-08-14T23:00:20.7038395Z adding: test/test-reports/python-pytest/test_pytree/test_pytree-5a1cba5baaa35504.xml (deflated 28%) 2025-08-14T23:00:20.7039146Z adding: test/test-reports/python-pytest/test_pytree/test_pytree-5161ae636053a4e0.xml (deflated 96%) 2025-08-14T23:00:20.7039911Z adding: test/test-reports/python-pytest/test_fx_passes/test_fx_passes-4d72b7230e6c59b9.xml (deflated 28%) 2025-08-14T23:00:20.7040701Z adding: test/test-reports/python-pytest/test_fx_passes/test_fx_passes-7a303209b13379b2.xml (deflated 96%) 2025-08-14T23:00:20.7041628Z adding: test/test-reports/python-pytest/test_module_tracker/test_module_tracker-304218aca0368b0f.xml (deflated 28%) 2025-08-14T23:00:20.7042551Z adding: test/test-reports/python-pytest/test_module_tracker/test_module_tracker-b7d4d6e104878113.xml (deflated 72%) 2025-08-14T23:00:20.7043372Z adding: test/test-reports/python-pytest/test_typing/test_typing-2c9e89eca0bb782d.xml (deflated 27%) 2025-08-14T23:00:20.7044120Z adding: test/test-reports/python-pytest/test_typing/test_typing-5badb422a2b5c93f.xml (deflated 91%) 2025-08-14T23:00:20.7044941Z adding: test/test-reports/python-pytest/test_openmp/test_openmp-7c17196140561020.xml (deflated 28%) 2025-08-14T23:00:20.7045683Z adding: test/test-reports/python-pytest/test_openmp/test_openmp-d6d563d2cb799cc0.xml (deflated 63%) 2025-08-14T23:00:20.7046680Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalarinherit/torch_np.numpy_tests.core.test_scalarinherit-5b20230047208bc3.xml (deflated 28%) 2025-08-14T23:00:20.7048183Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_scalarinherit/torch_np.numpy_tests.core.test_scalarinherit-66f318480e28ac23.xml (deflated 76%) 2025-08-14T23:00:20.7049308Z adding: test/test-reports/python-pytest/functorch.test_ac_knapsack/functorch.test_ac_knapsack-eda7ad7d4415a30f.xml (deflated 27%) 2025-08-14T23:00:20.7050294Z adding: test/test-reports/python-pytest/functorch.test_ac_knapsack/functorch.test_ac_knapsack-6e302404f97f3c4a.xml (deflated 89%) 2025-08-14T23:00:20.7051269Z adding: test/test-reports/python-pytest/backends.xeon.test_launch/backends.xeon.test_launch-029b854461595926.xml (deflated 28%) 2025-08-14T23:00:20.7052229Z adding: test/test-reports/python-pytest/backends.xeon.test_launch/backends.xeon.test_launch-19b73bbae962d277.xml (deflated 49%) 2025-08-14T23:00:20.7053234Z adding: test/test-reports/python-pytest/test_nestedtensor/test_nestedtensor-ebdeef2dd0c5465c.xml (deflated 43%) 2025-08-14T23:00:20.7086519Z adding: test/test-reports/python-pytest/test_nestedtensor/test_nestedtensor-080565ca83819542.xml (deflated 98%) 2025-08-14T23:00:20.7087392Z adding: test/test-reports/python-pytest/test_namedtensor/test_namedtensor-0c063673270ce61f.xml (deflated 28%) 2025-08-14T23:00:20.7089251Z adding: test/test-reports/python-pytest/test_namedtensor/test_namedtensor-8062496d290ea6f6.xml (deflated 95%) 2025-08-14T23:00:20.7090067Z adding: test/test-reports/python-pytest/xpu.test_gemm/xpu.test_gemm-e2ed9471c4868a14.xml (deflated 28%) 2025-08-14T23:00:20.7090823Z adding: test/test-reports/python-pytest/xpu.test_gemm/xpu.test_gemm-094e0eda0e528503.xml (deflated 28%) 2025-08-14T23:00:20.7091682Z adding: test/test-reports/python-pytest/torch_np.test_ufuncs_basic/torch_np.test_ufuncs_basic-e9e6ae16947b4352.xml (deflated 28%) 2025-08-14T23:00:20.7099076Z adding: test/test-reports/python-pytest/torch_np.test_ufuncs_basic/torch_np.test_ufuncs_basic-62a24d3863df2ac4.xml (deflated 98%) 2025-08-14T23:00:20.7099907Z adding: test/test-reports/python-pytest/test_meta/test_meta-e34e4ea36fa6397f.xml (deflated 28%) 2025-08-14T23:00:20.7991947Z adding: test/test-reports/python-pytest/test_meta/test_meta-f97c5aa66ca5dc6c.xml (deflated 99%) 2025-08-14T23:00:20.7992754Z adding: test/test-reports/python-pytest/test_utils_filelock/test_utils_filelock-98afaf492ba698b7.xml (deflated 28%) 2025-08-14T23:00:20.7993605Z adding: test/test-reports/python-pytest/test_utils_filelock/test_utils_filelock-85c047d67cc40026.xml (deflated 63%) 2025-08-14T23:00:20.7994457Z adding: test/test-reports/python-pytest/torch_np.test_random/torch_np.test_random-3d007123f05d075b.xml (deflated 28%) 2025-08-14T23:00:20.7995324Z adding: test/test-reports/python-pytest/torch_np.test_random/torch_np.test_random-1e14cc2f4fbdf8c1.xml (deflated 96%) 2025-08-14T23:00:20.7996208Z adding: test/test-reports/python-pytest/profiler.test_kineto/profiler.test_kineto-c79244f41cb7c308.xml (deflated 28%) 2025-08-14T23:00:20.7997094Z adding: test/test-reports/python-pytest/profiler.test_kineto/profiler.test_kineto-8e7a8c206c861396.xml (deflated 57%) 2025-08-14T23:00:20.7998194Z adding: test/test-reports/python-pytest/test_compile_benchmark_util/test_compile_benchmark_util-a4936a6a73b4975b.xml (deflated 28%) 2025-08-14T23:00:20.7999232Z adding: test/test-reports/python-pytest/test_compile_benchmark_util/test_compile_benchmark_util-b10951bae2993495.xml (deflated 44%) 2025-08-14T23:00:20.8000105Z adding: test/test-reports/python-pytest/test_jiterator/test_jiterator-fddae1b7e924224b.xml (deflated 28%) 2025-08-14T23:00:20.8004008Z adding: test/test-reports/python-pytest/test_jiterator/test_jiterator-78939c2a2deb7d22.xml (deflated 98%) 2025-08-14T23:00:20.8004816Z adding: test/test-reports/python-pytest/test_optim/test_optim-7a617cb740160910.xml (deflated 28%) 2025-08-14T23:00:20.8025387Z adding: test/test-reports/python-pytest/test_optim/test_optim-1c57e0820905b362.xml (deflated 98%) 2025-08-14T23:00:20.8026130Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-d2b7c4be1ccdebd9.xml (deflated 28%) 2025-08-14T23:00:20.8026866Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-d30890b7bdc53715.xml (deflated 28%) 2025-08-14T23:00:20.8027616Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-f13a1aa9d7440d5e.xml (deflated 28%) 2025-08-14T23:00:20.8028352Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-2b587696df92086c.xml (deflated 28%) 2025-08-14T23:00:20.8029086Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-db9cf8c8b4f5042a.xml (deflated 28%) 2025-08-14T23:00:20.8029822Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-48cbb8cd11a290d9.xml (deflated 27%) 2025-08-14T23:00:20.8041717Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-d38fa9f5d1d4af0f.xml (deflated 97%) 2025-08-14T23:00:20.8056637Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-4593e62c69cc8780.xml (deflated 98%) 2025-08-14T23:00:20.8071416Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-a18c0b5590d1d068.xml (deflated 97%) 2025-08-14T23:00:20.8086682Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-efa9afeb009c7343.xml (deflated 98%) 2025-08-14T23:00:20.8100384Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-06e35b148ad2ae3a.xml (deflated 97%) 2025-08-14T23:00:20.8115272Z adding: test/test-reports/python-pytest/test_decomp/test_decomp-270000ea779c6897.xml (deflated 97%) 2025-08-14T23:00:20.8116064Z adding: test/test-reports/python-pytest/test_binary_ufuncs/test_binary_ufuncs-266d254e2ecd989b.xml (deflated 28%) 2025-08-14T23:00:20.8415222Z adding: test/test-reports/python-pytest/test_binary_ufuncs/test_binary_ufuncs-6fec35f1dd438563.xml (deflated 99%) 2025-08-14T23:00:20.8416046Z adding: test/test-reports/python-pytest/test_matmul_cuda/test_matmul_cuda-cf3547e81f7916b3.xml (deflated 28%) 2025-08-14T23:00:20.8452381Z adding: test/test-reports/python-pytest/test_matmul_cuda/test_matmul_cuda-b7e1c0a5a6b03283.xml (deflated 99%) 2025-08-14T23:00:20.8453912Z adding: test/test-reports/python-pytest/test_weak/test_weak-f574411498892646.xml (deflated 28%) 2025-08-14T23:00:20.8455303Z adding: test/test-reports/python-pytest/test_weak/test_weak-7e76390db664afb4.xml (deflated 93%) 2025-08-14T23:00:20.8456722Z adding: test/test-reports/python-pytest/functorch.test_logging/functorch.test_logging-efd74ff9ed0cb22e.xml (deflated 28%) 2025-08-14T23:00:20.8457657Z adding: test/test-reports/python-pytest/functorch.test_logging/functorch.test_logging-b92491d178af51f3.xml (deflated 43%) 2025-08-14T23:00:20.8458495Z adding: test/test-reports/python-pytest/test_logging/test_logging-9859a36895c6aace.xml (deflated 28%) 2025-08-14T23:00:20.8459245Z adding: test/test-reports/python-pytest/test_logging/test_logging-e255c82df73141e5.xml (deflated 44%) 2025-08-14T23:00:20.8460015Z adding: test/test-reports/python-pytest/test_out_dtype_op/test_out_dtype_op-0c283e0cc4e362f8.xml (deflated 28%) 2025-08-14T23:00:20.8460815Z adding: test/test-reports/python-pytest/test_out_dtype_op/test_out_dtype_op-9a4e8c1a7af3d109.xml (deflated 89%) 2025-08-14T23:00:20.8461760Z adding: test/test-reports/python-pytest/test_complex/test_complex-01d45e38aa427368.xml (deflated 28%) 2025-08-14T23:00:20.8462570Z adding: test/test-reports/python-pytest/test_complex/test_complex-f50298af1c21141a.xml (deflated 92%) 2025-08-14T23:00:20.8463736Z adding: test/test-reports/python-pytest/cpp_extensions.python_agnostic_extension.test.test_python_agnostic/cpp_extensions.python_agnostic_extension.test.test_python_agnostic-0928afe67bececa6.xml (deflated 27%) 2025-08-14T23:00:20.8465303Z adding: test/test-reports/python-pytest/cpp_extensions.python_agnostic_extension.test.test_python_agnostic/cpp_extensions.python_agnostic_extension.test.test_python_agnostic-502f485b4d84c404.xml (deflated 51%) 2025-08-14T23:00:20.8466646Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_einsum/torch_np.numpy_tests.core.test_einsum-9d1fe9d456ffc399.xml (deflated 28%) 2025-08-14T23:00:20.8467756Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_einsum/torch_np.numpy_tests.core.test_einsum-5d4a44c48ebd580e.xml (deflated 95%) 2025-08-14T23:00:20.8468963Z adding: test/test-reports/python-pytest/test_ops_jit/test_ops_jit-cfbe1a0e17712f7c.xml (deflated 28%) 2025-08-14T23:00:20.8489642Z adding: test/test-reports/python-pytest/test_ops_jit/test_ops_jit-9a0e5deb38f23e2e.xml (deflated 98%) 2025-08-14T23:00:20.8490523Z adding: test/test-reports/python-pytest/lazy.test_step_closures/lazy.test_step_closures-61cbc398e0f2a177.xml (deflated 28%) 2025-08-14T23:00:20.8491449Z adding: test/test-reports/python-pytest/lazy.test_step_closures/lazy.test_step_closures-91fc4ec29012efd6.xml (deflated 79%) 2025-08-14T23:00:20.8492335Z adding: test/test-reports/python-pytest/test_ops_gradients/test_ops_gradients-f752071bf8daf085.xml (deflated 28%) 2025-08-14T23:00:20.8625712Z adding: test/test-reports/python-pytest/test_ops_gradients/test_ops_gradients-bb71a41efee619d9.xml (deflated 98%) 2025-08-14T23:00:20.8626616Z adding: test/test-reports/python-pytest/test_fx_reinplace_pass/test_fx_reinplace_pass-54c536484485ba20.xml (deflated 28%) 2025-08-14T23:00:20.8627511Z adding: test/test-reports/python-pytest/test_fx_reinplace_pass/test_fx_reinplace_pass-7dfe5f7cf0687f67.xml (deflated 90%) 2025-08-14T23:00:20.8628454Z adding: test/test-reports/python-pytest/nn.test_multihead_attention/nn.test_multihead_attention-dfa9132b35c6d770.xml (deflated 28%) 2025-08-14T23:00:20.8629424Z adding: test/test-reports/python-pytest/nn.test_multihead_attention/nn.test_multihead_attention-c92d7d691bbe5656.xml (deflated 92%) 2025-08-14T23:00:20.8630505Z adding: test/test-reports/python-pytest/functorch.test_aot_joint_with_descriptors/functorch.test_aot_joint_with_descriptors-9a7035125f22464b.xml (deflated 28%) 2025-08-14T23:00:20.8631702Z adding: test/test-reports/python-pytest/functorch.test_aot_joint_with_descriptors/functorch.test_aot_joint_with_descriptors-ca365a7499cefc37.xml (deflated 89%) 2025-08-14T23:00:20.8632697Z adding: test/test-reports/python-pytest/test_pruning_op/test_pruning_op-509361525793eb95.xml (deflated 28%) 2025-08-14T23:00:20.8633487Z adding: test/test-reports/python-pytest/test_pruning_op/test_pruning_op-1b99e97cab86ba86.xml (deflated 64%) 2025-08-14T23:00:20.8634388Z adding: test/test-reports/python-pytest/lazy.test_functionalization/lazy.test_functionalization-1ccf1dc2c7660b2d.xml (deflated 28%) 2025-08-14T23:00:20.8635410Z adding: test/test-reports/python-pytest/lazy.test_functionalization/lazy.test_functionalization-770f64d53c8d2c56.xml (deflated 64%) 2025-08-14T23:00:20.8636333Z adding: test/test-reports/python-pytest/functorch.test_ac/functorch.test_ac-2d7e59736e7cf25b.xml (deflated 28%) 2025-08-14T23:00:20.8637176Z adding: test/test-reports/python-pytest/functorch.test_ac/functorch.test_ac-868767058d75f3f2.xml (deflated 87%) 2025-08-14T23:00:20.8638034Z adding: test/test-reports/python-pytest/test_ops_fwd_gradients/test_ops_fwd_gradients-978578fd8e7e8f29.xml (deflated 28%) 2025-08-14T23:00:20.8712494Z adding: test/test-reports/python-pytest/test_ops_fwd_gradients/test_ops_fwd_gradients-f97c9f2979593017.xml (deflated 98%) 2025-08-14T23:00:20.8713376Z adding: test/test-reports/python-pytest/test_hub/test_hub-65032c03c8ceaf40.xml (deflated 28%) 2025-08-14T23:00:20.8714082Z adding: test/test-reports/python-pytest/test_hub/test_hub-53c29c0f0bf8e9b5.xml (deflated 91%) 2025-08-14T23:00:20.8714985Z adding: test/test-reports/python-pytest/test_set_default_mobile_cpu_allocator/test_set_default_mobile_cpu_allocator-d1c5b128887e3bc6.xml (deflated 27%) 2025-08-14T23:00:20.8716092Z adding: test/test-reports/python-pytest/test_set_default_mobile_cpu_allocator/test_set_default_mobile_cpu_allocator-cfba14ebb9d20403.xml (deflated 66%) 2025-08-14T23:00:20.8717158Z adding: test/test-reports/python-pytest/test_cuda_expandable_segments/test_cuda_expandable_segments-56d6205049ab82ce.xml (deflated 89%) 2025-08-14T23:00:20.8720467Z adding: test/test-reports/python-pytest/test_cuda_expandable_segments/test_cuda_expandable_segments-fe09d79894594e4e.xml (deflated 94%) 2025-08-14T23:00:20.8721415Z adding: test/test-reports/python-pytest/test_stateless/test_stateless-66bbd7799032127e.xml (deflated 28%) 2025-08-14T23:00:20.8722218Z adding: test/test-reports/python-pytest/test_stateless/test_stateless-2c3a62bd6c23149e.xml (deflated 95%) 2025-08-14T23:00:20.8723050Z adding: test/test-reports/python-pytest/test_numpy_interop/test_numpy_interop-3aba05ab97ecdabb.xml (deflated 28%) 2025-08-14T23:00:20.8723923Z adding: test/test-reports/python-pytest/test_numpy_interop/test_numpy_interop-ff2400680abf3894.xml (deflated 95%) 2025-08-14T23:00:20.8724921Z adding: test/test-reports/python-pytest/profiler.test_record_function/profiler.test_record_function-e16a101b9fe5cd14.xml (deflated 28%) 2025-08-14T23:00:20.8726036Z adding: test/test-reports/python-pytest/profiler.test_record_function/profiler.test_record_function-5497e8326aca9059.xml (deflated 79%) 2025-08-14T23:00:20.8726989Z adding: test/test-reports/python-pytest/test_type_hints/test_type_hints-b9ba20c8b0dd5a49.xml (deflated 28%) 2025-08-14T23:00:20.8727808Z adding: test/test-reports/python-pytest/test_type_hints/test_type_hints-9579aea05d2a5442.xml (deflated 58%) 2025-08-14T23:00:20.8728638Z adding: test/test-reports/python-pytest/nn.test_lazy_modules/nn.test_lazy_modules-37f49c6a4b30ee26.xml (deflated 28%) 2025-08-14T23:00:20.8729503Z adding: test/test-reports/python-pytest/nn.test_lazy_modules/nn.test_lazy_modules-1e16194efc6e2e1b.xml (deflated 96%) 2025-08-14T23:00:20.8730396Z adding: test/test-reports/python-pytest/test_segment_reductions/test_segment_reductions-99661aaebb40af02.xml (deflated 28%) 2025-08-14T23:00:20.8731338Z adding: test/test-reports/python-pytest/test_segment_reductions/test_segment_reductions-a0f29d4fda69393a.xml (deflated 97%) 2025-08-14T23:00:20.8732225Z adding: test/test-reports/python-pytest/test_import_stats/test_import_stats-98182126be6f6082.xml (deflated 28%) 2025-08-14T23:00:20.8733057Z adding: test/test-reports/python-pytest/test_import_stats/test_import_stats-95af276801db0faa.xml (deflated 63%) 2025-08-14T23:00:20.8733835Z adding: test/test-reports/python-pytest/test_masked/test_masked-672a699d4f534d8d.xml (deflated 28%) 2025-08-14T23:00:20.8736559Z adding: test/test-reports/python-pytest/test_masked/test_masked-ff1ec447616df592.xml (deflated 98%) 2025-08-14T23:00:20.8737441Z adding: test/test-reports/python-pytest/test_cuda_sanitizer/test_cuda_sanitizer-f0fc5213810fd756.xml (deflated 28%) 2025-08-14T23:00:20.8738308Z adding: test/test-reports/python-pytest/test_cuda_sanitizer/test_cuda_sanitizer-f32e588876d5a52f.xml (deflated 88%) 2025-08-14T23:00:20.8739122Z adding: test/test-reports/python-pytest/test_monitor/test_monitor-d57e0fd8924a8bf5.xml (deflated 28%) 2025-08-14T23:00:20.8739904Z adding: test/test-reports/python-pytest/test_monitor/test_monitor-03b2a391913152d1.xml (deflated 80%) 2025-08-14T23:00:20.8740754Z adding: test/test-reports/python-pytest/profiler.test_cpp_thread/profiler.test_cpp_thread-1cb1d80e3fc9b726.xml (deflated 27%) 2025-08-14T23:00:20.8741810Z adding: test/test-reports/python-pytest/profiler.test_cpp_thread/profiler.test_cpp_thread-0bbbdbec26a80ddb.xml (deflated 85%) 2025-08-14T23:00:20.8742724Z adding: test/test-reports/python-pytest/test_subclass/test_subclass-6002a503538d0b15.xml (deflated 28%) 2025-08-14T23:00:20.8743493Z adding: test/test-reports/python-pytest/test_subclass/test_subclass-c5a2d455792c1735.xml (deflated 97%) 2025-08-14T23:00:20.8744351Z adding: test/test-reports/python-pytest/profiler.test_profiler/profiler.test_profiler-f8b49d448c8594cf.xml (deflated 90%) 2025-08-14T23:00:20.8746253Z adding: test/test-reports/python-pytest/profiler.test_profiler/profiler.test_profiler-ba9220e83e7d6a9a.xml (deflated 93%) 2025-08-14T23:00:20.8747605Z adding: test/test-reports/python-pytest/functorch.test_vmap_registrations/functorch.test_vmap_registrations-b57162d768ca187a.xml (deflated 28%) 2025-08-14T23:00:20.8788786Z adding: test/test-reports/python-pytest/functorch.test_vmap_registrations/functorch.test_vmap_registrations-f4c13d3522889ea7.xml (deflated 98%) 2025-08-14T23:00:20.8789835Z adding: test/test-reports/python-pytest/nn.test_parametrization/nn.test_parametrization-5b1dda61dd96bec2.xml (deflated 28%) 2025-08-14T23:00:20.8790782Z adding: test/test-reports/python-pytest/nn.test_parametrization/nn.test_parametrization-4e5a02746769d999.xml (deflated 96%) 2025-08-14T23:00:20.8791648Z adding: test/test-reports/python-pytest/nn.test_pruning/nn.test_pruning-8a7b4f65053cf2e0.xml (deflated 28%) 2025-08-14T23:00:20.8792449Z adding: test/test-reports/python-pytest/nn.test_pruning/nn.test_pruning-239bef2368f38872.xml (deflated 93%) 2025-08-14T23:00:20.8793282Z adding: test/test-reports/python-pytest/functorch.test_dims/functorch.test_dims-aee844d319a9a8e5.xml (deflated 28%) 2025-08-14T23:00:20.8794495Z adding: test/test-reports/python-pytest/functorch.test_dims/functorch.test_dims-ee627c26826f8c61.xml (deflated 95%) 2025-08-14T23:00:20.8795298Z adding: test/test-reports/python-pytest/test_itt/test_itt-71720554be2e13ba.xml (deflated 28%) 2025-08-14T23:00:20.8796000Z adding: test/test-reports/python-pytest/test_itt/test_itt-707cebdbf052979b.xml (deflated 44%) 2025-08-14T23:00:20.8796967Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_multiarray/torch_np.numpy_tests.core.test_multiarray-c23ab75c76b77102.xml (deflated 28%) 2025-08-14T23:00:20.8807328Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_multiarray/torch_np.numpy_tests.core.test_multiarray-cb0728f5ed4a5a5f.xml (deflated 96%) 2025-08-14T23:00:20.8808442Z adding: test/test-reports/python-pytest/functorch.test_aotdispatch/functorch.test_aotdispatch-873b9bd029ae5f87.xml (deflated 28%) 2025-08-14T23:00:20.8821991Z adding: test/test-reports/python-pytest/functorch.test_aotdispatch/functorch.test_aotdispatch-dfaa5cfe1c9c75f0.xml (deflated 96%) 2025-08-14T23:00:20.8822934Z adding: test/test-reports/python-pytest/test_schema_check/test_schema_check-88c0c42cb3e8ea94.xml (deflated 28%) 2025-08-14T23:00:20.8963597Z adding: test/test-reports/python-pytest/test_schema_check/test_schema_check-d3bdca5d2b0be9f9.xml (deflated 99%) 2025-08-14T23:00:20.8964429Z adding: test/test-reports/python-pytest/test_datapipe/test_datapipe-a148d0860422352d.xml (deflated 28%) 2025-08-14T23:00:20.8966479Z adding: test/test-reports/python-pytest/test_datapipe/test_datapipe-516f7db1f2758eb3.xml (deflated 94%) 2025-08-14T23:00:20.8967368Z adding: test/test-reports/python-pytest/test_bundled_inputs/test_bundled_inputs-304785a8b33a7632.xml (deflated 28%) 2025-08-14T23:00:20.8968218Z adding: test/test-reports/python-pytest/test_bundled_inputs/test_bundled_inputs-06f466bb94085214.xml (deflated 89%) 2025-08-14T23:00:20.8969157Z adding: test/test-reports/python-pytest/profiler.test_execution_trace/profiler.test_execution_trace-8d65d5e4c4d2426b.xml (deflated 27%) 2025-08-14T23:00:20.8970180Z adding: test/test-reports/python-pytest/profiler.test_execution_trace/profiler.test_execution_trace-3765e4d07810d7a5.xml (deflated 91%) 2025-08-14T23:00:20.8971304Z adding: test/test-reports/python-pytest/lazy.test_generator/lazy.test_generator-29897c5f60f7c80a.xml (deflated 28%) 2025-08-14T23:00:20.8972225Z adding: test/test-reports/python-pytest/lazy.test_generator/lazy.test_generator-02c08076093f0f39.xml (deflated 64%) 2025-08-14T23:00:20.8973215Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_indexing/torch_np.numpy_tests.core.test_indexing-96c7468eeaae9402.xml (deflated 28%) 2025-08-14T23:00:20.8974361Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.core.test_indexing/torch_np.numpy_tests.core.test_indexing-ceadda83dbaa4e7c.xml (deflated 94%) 2025-08-14T23:00:20.8975364Z adding: test/test-reports/python-pytest/test_maskedtensor/test_maskedtensor-92f66f296a62b2cb.xml (deflated 28%) 2025-08-14T23:00:20.8996659Z adding: test/test-reports/python-pytest/test_maskedtensor/test_maskedtensor-0c4bbf6700367dc1.xml (deflated 99%) 2025-08-14T23:00:20.8997575Z adding: test/test-reports/python-pytest/test_tensorboard/test_tensorboard-3ef21a83c5ea8cd8.xml (deflated 28%) 2025-08-14T23:00:20.8998427Z adding: test/test-reports/python-pytest/test_tensorboard/test_tensorboard-c70c2679ad75ecb2.xml (deflated 94%) 2025-08-14T23:00:20.8999406Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_type_check/torch_np.numpy_tests.lib.test_type_check-115bc4481da7b6b4.xml (deflated 28%) 2025-08-14T23:00:20.9000553Z adding: test/test-reports/python-pytest/torch_np.numpy_tests.lib.test_type_check/torch_np.numpy_tests.lib.test_type_check-a0db70ff988a9df7.xml (deflated 95%) 2025-08-14T23:00:20.9001548Z adding: test/test-reports/python-pytest/lazy.test_ts_opinfo/lazy.test_ts_opinfo-3dfdb162dd913768.xml (deflated 28%) 2025-08-14T23:00:20.9002477Z adding: test/test-reports/python-pytest/lazy.test_ts_opinfo/lazy.test_ts_opinfo-a9e03b30e1cb9d79.xml (deflated 77%) 2025-08-14T23:00:20.9003314Z adding: test/test-reports/python-pytest/test_mkldnn_fusion/test_mkldnn_fusion-9bd19ae3ce10ac8a.xml (deflated 28%) 2025-08-14T23:00:20.9004158Z adding: test/test-reports/python-pytest/test_mkldnn_fusion/test_mkldnn_fusion-aaa97d4ffabe4fbc.xml (deflated 81%) 2025-08-14T23:00:20.9005200Z adding: test/test-reports/python-pytest/benchmark_utils.test_benchmark_utils/benchmark_utils.test_benchmark_utils-2f0afeee79878356.xml (deflated 28%) 2025-08-14T23:00:20.9006316Z adding: test/test-reports/python-pytest/benchmark_utils.test_benchmark_utils/benchmark_utils.test_benchmark_utils-77aaa65222b1aaeb.xml (deflated 85%) 2025-08-14T23:00:20.9007320Z adding: test/test-reports/python-pytest/test_sparse_csr/test_sparse_csr-5bb73aa008d11e13.xml (deflated 28%) 2025-08-14T23:00:20.9064145Z adding: test/test-reports/python-pytest/test_sparse_csr/test_sparse_csr-47f9adeb1a6f74c3.xml (deflated 99%) 2025-08-14T23:00:20.9221965Z ##[group]Run # Remove any previous usage logs if they exist 2025-08-14T23:00:20.9222414Z # Remove any previous usage logs if they exist 2025-08-14T23:00:20.9222785Z rm -f logs-*.zip 2025-08-14T23:00:20.9223155Z zip "logs-${FILE_SUFFIX}.zip" 'usage_log.txt' || true 2025-08-14T23:00:20.9223639Z zip -r "logs-${FILE_SUFFIX}.zip" test/test-reports -i '*.log' || true 2025-08-14T23:00:20.9232804Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:20.9233167Z env: 2025-08-14T23:00:20.9233386Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:20.9233722Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:20.9234252Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:20.9234725Z DEVICE_NAME: 2025-08-14T23:00:20.9234953Z DEVICE_TYPE: 2025-08-14T23:00:20.9235313Z FILE_SUFFIX: test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047 2025-08-14T23:00:20.9235712Z ##[endgroup] 2025-08-14T23:00:20.9554355Z adding: usage_log.txt (deflated 96%) 2025-08-14T23:00:20.9640470Z adding: test/test-reports/functorch.test_aotdispatch_1.1_5cea7e9107e60ff5_.log (deflated 52%) 2025-08-14T23:00:20.9641329Z adding: test/test-reports/export.test_torchbind_1.1_3403cc13c6c437a4_.log (deflated 50%) 2025-08-14T23:00:20.9642022Z adding: test/test-reports/test_mkl_verbose_1.1_24b447eeeaf7a91f_.log (deflated 54%) 2025-08-14T23:00:20.9642600Z adding: test/test-reports/test_testing_1.1_428bac4584b126f4_.log (deflated 49%) 2025-08-14T23:00:20.9643186Z adding: test/test-reports/test_schema_check_1.1_a31b39d6f01ff593_.log (deflated 50%) 2025-08-14T23:00:20.9643819Z adding: test/test-reports/inductor.test_aot_inductor_1.1_49cc9f651ce74b84_.log (deflated 51%) 2025-08-14T23:00:20.9644444Z adding: test/test-reports/test_autoload_1.1_d6478d761f330ede_.log (deflated 49%) 2025-08-14T23:00:20.9645101Z adding: test/test-reports/test_jit_fuser_te_1.1_aa945fb3085b14d9_.log (deflated 49%) 2025-08-14T23:00:20.9645683Z adding: test/test-reports/test_datapipe_1.1_400e526d20818677_.log (deflated 49%) 2025-08-14T23:00:20.9646283Z adding: test/test-reports/test_public_bindings_1.1_db83b09c1c31bc62_.log (deflated 50%) 2025-08-14T23:00:20.9647676Z adding: test/test-reports/test_hub_1.1_297d669373694752_.log (deflated 77%) 2025-08-14T23:00:20.9648894Z adding: test/test-reports/test_linalg_1.1_a9f2a3ed094707dd_.log (deflated 70%) 2025-08-14T23:00:20.9649650Z adding: test/test-reports/test_weak_1.1_7f7419b6f8c53683_.log (deflated 49%) 2025-08-14T23:00:20.9650524Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_1.2_e65a746773479ef9_.log (deflated 60%) 2025-08-14T23:00:20.9651211Z adding: test/test-reports/test_decomp_15.16_0f80eddad8cfd275_.log (deflated 49%) 2025-08-14T23:00:20.9652515Z adding: test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.2_29e13ac9e09d87bf_.log (deflated 53%) 2025-08-14T23:00:20.9653387Z adding: test/test-reports/test_pruning_op_1.1_c3e5e9c9d8e5bb02_.log (deflated 50%) 2025-08-14T23:00:20.9654126Z adding: test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_2.2_1c2b59265d1c83de_.log (deflated 61%) 2025-08-14T23:00:20.9654839Z adding: test/test-reports/test_hub_1.1_c7d0f73da0df29bb_.log (deflated 49%) 2025-08-14T23:00:20.9655486Z adding: test/test-reports/inductor.test_torchinductor_opinfo_2.11_cd00328892db7f00_.log (deflated 51%) 2025-08-14T23:00:20.9656672Z adding: test/test-reports/test_cuda_expandable_segments_1.1_dd97eb7e0b6f2299_.log (deflated 81%) 2025-08-14T23:00:20.9657392Z adding: test/test-reports/inductor.test_torchinductor_opinfo_5.11_3945404b3985866f_.log (deflated 51%) 2025-08-14T23:00:20.9658054Z adding: test/test-reports/test_stateless_1.1_1485ac0af2e6e52c_.log (deflated 49%) 2025-08-14T23:00:20.9658721Z adding: test/test-reports/inductor.test_torchinductor_opinfo_6.11_b0df501d68adc34e_.log (deflated 51%) 2025-08-14T23:00:20.9659402Z adding: test/test-reports/test_numpy_interop_1.1_2b9cd577bcd7d515_.log (deflated 50%) 2025-08-14T23:00:20.9660078Z adding: test/test-reports/inductor.test_torchinductor_opinfo_8.11_82e4b6f8d8de8356_.log (deflated 51%) 2025-08-14T23:00:20.9660755Z adding: test/test-reports/test_bundled_inputs_1.1_06597f0886aaeaf0_.log (deflated 50%) 2025-08-14T23:00:20.9661380Z adding: test/test-reports/inductor.test_cpu_repro_1.1_d9d8aae1e6c61cd7_.log (deflated 50%) 2025-08-14T23:00:20.9662010Z adding: test/test-reports/lazy.test_generator_1.1_a9c94ba676cba13c_.log (deflated 50%) 2025-08-14T23:00:20.9662635Z adding: test/test-reports/inductor.test_cuda_repro_1.1_55d9a103999b0117_.log (deflated 50%) 2025-08-14T23:00:20.9663312Z adding: test/test-reports/profiler.test_record_function_1.1_8e6c6182bc0475e9_.log (deflated 51%) 2025-08-14T23:00:20.9664017Z adding: test/test-reports/inductor.test_kernel_benchmark_1.1_df2d6f139dcc92f4_.log (deflated 51%) 2025-08-14T23:00:20.9678701Z adding: test/test-reports/test_type_hints_1.1_9c65717c87e2b157_.log (deflated 50%) 2025-08-14T23:00:20.9679526Z adding: test/test-reports/inductor.test_cudagraph_trees_1.1_a3fe4b5b8ef2b54b_.log (deflated 51%) 2025-08-14T23:00:20.9680193Z adding: test/test-reports/test_maskedtensor_1.1_5ab0512779c9742d_.log (deflated 50%) 2025-08-14T23:00:20.9681077Z adding: test/test-reports/dynamo.test_dynamic_shapes_1.2_6affaaed98f775d6_.log (deflated 57%) 2025-08-14T23:00:20.9681710Z adding: test/test-reports/test_tensorboard_1.1_5a3038c31f43a006_.log (deflated 50%) 2025-08-14T23:00:20.9682340Z adding: test/test-reports/dynamo.test_dynamic_shapes_2.2_489e15cdbe712c04_.log (deflated 50%) 2025-08-14T23:00:20.9682970Z adding: test/test-reports/test_proxy_tensor_1.1_e9eff395dc8ac999_.log (deflated 92%) 2025-08-14T23:00:20.9683564Z adding: test/test-reports/dynamo.test_unspec_1.1_77c4b5228e9df64e_.log (deflated 50%) 2025-08-14T23:00:20.9686149Z adding: test/test-reports/test_quantization_1.4_e7c77d5043e3495a_.log (deflated 90%) 2025-08-14T23:00:20.9686772Z adding: test/test-reports/dynamo.test_repros_1.1_6ad3aa68bd0162ed_.log (deflated 51%) 2025-08-14T23:00:20.9687399Z adding: test/test-reports/nn.test_lazy_modules_1.1_c1fc0e0d51b9b320_.log (deflated 50%) 2025-08-14T23:00:20.9688061Z adding: test/test-reports/inductor.test_pattern_matcher_1.1_a575b490b2df0e9d_.log (deflated 51%) 2025-08-14T23:00:20.9688744Z adding: test/test-reports/functorch.test_logging_1.1_0831338a54dca243_.log (deflated 50%) 2025-08-14T23:00:20.9689449Z adding: test/test-reports/inductor.test_mkldnn_pattern_matcher_1.1_a4228dd5cc3029fb_.log (deflated 51%) 2025-08-14T23:00:20.9690154Z adding: test/test-reports/test_segment_reductions_1.1_1bccc52652b6229f_.log (deflated 50%) 2025-08-14T23:00:20.9690843Z adding: test/test-reports/inductor.test_extension_backend_1.1_2dbcc54c737b0136_.log (deflated 51%) 2025-08-14T23:00:20.9691488Z adding: test/test-reports/test_license_1.1_b42e3f3f0e8a4875_.log (deflated 51%) 2025-08-14T23:00:20.9692155Z adding: test/test-reports/inductor.test_perf_1.1_f500618ae9486d87_.log (deflated 50%) 2025-08-14T23:00:20.9692771Z adding: test/test-reports/lazy.test_ts_opinfo_1.1_97dc335a0cbefc26_.log (deflated 50%) 2025-08-14T23:00:20.9693400Z adding: test/test-reports/inductor.test_foreach_1.1_a8f0558397ce3279_.log (deflated 50%) 2025-08-14T23:00:20.9696388Z adding: test/test-reports/test_ao_sparsity_1.1_9bf0e22d6c465b9f_.log (deflated 90%) 2025-08-14T23:00:20.9697053Z adding: test/test-reports/dynamo.test_modules_1.1_836ba8c61d988e49_.log (deflated 50%) 2025-08-14T23:00:20.9697666Z adding: test/test-reports/test_import_stats_1.1_f0a6a41b5537e1c2_.log (deflated 50%) 2025-08-14T23:00:20.9698313Z adding: test/test-reports/inductor.test_triton_wrapper_1.1_57001427517ef3fd_.log (deflated 51%) 2025-08-14T23:00:20.9698941Z adding: test/test-reports/test_logging_1.1_acc27cb618aa62db_.log (deflated 49%) 2025-08-14T23:00:20.9699616Z adding: test/test-reports/inductor.test_coordinate_descent_tuner_1.1_5dbc4fcc6d8ccd5c_.log (deflated 51%) 2025-08-14T23:00:20.9700294Z adding: test/test-reports/test_masked_1.1_684ca6949ae8b806_.log (deflated 49%) 2025-08-14T23:00:20.9700924Z adding: test/test-reports/inductor.test_select_algorithm_1.1_26088e8d5723f721_.log (deflated 51%) 2025-08-14T23:00:20.9701582Z adding: test/test-reports/test_mkldnn_fusion_1.1_afce9695227f85d8_.log (deflated 50%) 2025-08-14T23:00:20.9702234Z adding: test/test-reports/inductor.test_dependencies_1.1_eedb6024cec1129b_.log (deflated 51%) 2025-08-14T23:00:20.9702883Z adding: test/test-reports/test_cuda_sanitizer_1.1_62790a170bba1501_.log (deflated 50%) 2025-08-14T23:00:20.9703547Z adding: test/test-reports/inductor.test_compiled_autograd_1.2_ff307f387c036eaa_.log (deflated 51%) 2025-08-14T23:00:20.9704203Z adding: test/test-reports/optim.test_swa_utils_1.1_27c063586cf40236_.log (deflated 36%) 2025-08-14T23:00:20.9704828Z adding: test/test-reports/inductor.test_halide_1.1_20c94affc301161d_.log (deflated 35%) 2025-08-14T23:00:20.9705432Z adding: test/test-reports/test_out_dtype_op_1.1_1888ec089aaec0d8_.log (deflated 50%) 2025-08-14T23:00:20.9706111Z adding: test/test-reports/inductor.test_triton_extension_backend_1.1_45e669d54ef88616_.log (deflated 51%) 2025-08-14T23:00:20.9706949Z adding: test/test-reports/test_sparse_csr_2.2_70d5fc485050275a_.log (deflated 49%) 2025-08-14T23:00:20.9707644Z adding: test/test-reports/inductor.test_autoheuristic_1.1_10c8e3ab7860f4ed_.log (deflated 50%) 2025-08-14T23:00:20.9734254Z adding: test/test-reports/export.test_torchbind_1.1_565cb737b1990be8_.log (deflated 97%) 2025-08-14T23:00:20.9734906Z adding: test/test-reports/inductor.test_mps_basic_1.1_6369e8c9d1e3e816_.log (deflated 35%) 2025-08-14T23:00:20.9735519Z adding: test/test-reports/test_monitor_1.1_65b571211c6349bc_.log (deflated 49%) 2025-08-14T23:00:20.9736152Z adding: test/test-reports/dynamo.test_precompile_context_1.1_d12ecc8d077c3000_.log (deflated 51%) 2025-08-14T23:00:20.9736797Z adding: test/test-reports/test_complex_1.1_5282aa5f14d98d61_.log (deflated 49%) 2025-08-14T23:00:20.9737730Z adding: test/test-reports/dynamo.test_cudagraphs_expandable_segments_1.1_2b73e64dd0faa39c_.log (deflated 52%) 2025-08-14T23:00:20.9760443Z adding: test/test-reports/inductor.test_cpu_repro_1.1_b2551ab843b6cf3f_.log (deflated 97%) 2025-08-14T23:00:20.9761137Z adding: test/test-reports/dynamo.test_guard_manager_1.1_cbdf2edbea03f0a9_.log (deflated 51%) 2025-08-14T23:00:20.9807253Z adding: test/test-reports/test_testing_1.1_a4c760f0195db170_.log (deflated 96%) 2025-08-14T23:00:20.9807894Z adding: test/test-reports/test_sparse_semi_structured_1.1_4492803570b5cb8a_.log (deflated 51%) 2025-08-14T23:00:20.9836957Z adding: test/test-reports/test_linalg_1.1_f8db1d9fd815e06e_.log (deflated 95%) 2025-08-14T23:00:20.9837574Z adding: test/test-reports/export.test_serialize_1.1_381bdbeaf5980b0e_.log (deflated 50%) 2025-08-14T23:00:20.9838250Z adding: test/test-reports/profiler.test_cpp_thread_1.1_dfcd5c6b4128e72b_.log (deflated 56%) 2025-08-14T23:00:20.9839310Z adding: test/test-reports/dynamo.test_structured_trace_1.1_ad7791956a1e129f_.log (deflated 51%) 2025-08-14T23:00:20.9839994Z adding: test/test-reports/test_public_bindings_1.1_41f4f8e50525fdff_.log (deflated 62%) 2025-08-14T23:00:20.9840627Z adding: test/test-reports/export.test_verifier_1.1_cca23f96c748147f_.log (deflated 50%) 2025-08-14T23:00:20.9978818Z adding: test/test-reports/test_jit_fuser_te_1.1_a60f594d95c3cac2_.log (deflated 96%) 2025-08-14T23:00:20.9979460Z adding: test/test-reports/dynamo.test_recompile_ux_1.1_5192c675fc088558_.log (deflated 50%) 2025-08-14T23:00:21.0003471Z adding: test/test-reports/inductor.test_aot_inductor_1.1_ed46402d2e231278_.log (deflated 94%) 2025-08-14T23:00:21.0004113Z adding: test/test-reports/dynamo.test_minifier_1.1_3616437461683997_.log (deflated 50%) 2025-08-14T23:00:21.0004771Z adding: test/test-reports/test_binary_ufuncs_1.1_ba22464ca86353a9_.log (deflated 49%) 2025-08-14T23:00:21.0005654Z adding: test/test-reports/inductor.test_cudagraph_trees_expandable_segments_1.1_2393cada8ea081e6_.log (deflated 52%) 2025-08-14T23:00:21.0006403Z adding: test/test-reports/test_subclass_1.1_06cbb05cd8ef5ca9_.log (deflated 49%) 2025-08-14T23:00:21.0007125Z adding: test/test-reports/test_functionalization_of_rng_ops_1.1_f2ac459ebeb667b7_.log (deflated 51%) 2025-08-14T23:00:21.0007794Z adding: test/test-reports/test_file_check_1.1_e1bcccea08b53b5a_.log (deflated 53%) 2025-08-14T23:00:21.0008408Z adding: test/test-reports/dynamo.test_hooks_1.1_eed85e2d259511ff_.log (deflated 50%) 2025-08-14T23:00:21.0010839Z adding: test/test-reports/inductor.test_cuda_repro_1.1_dd7de682da51f01b_.log (deflated 88%) 2025-08-14T23:00:21.0011510Z adding: test/test-reports/dynamo.test_generator_1.1_07c9831a7f3865b7_.log (deflated 50%) 2025-08-14T23:00:21.0012822Z adding: test/test-reports/dynamo.test_unspec_1.1_c7274b4f8cf7bf0a_.log (deflated 86%) 2025-08-14T23:00:21.0013587Z adding: test/test-reports/dynamo.test_reorder_logs_1.1_4a05519ab7b741a8_.log (deflated 50%) 2025-08-14T23:00:21.0044267Z adding: test/test-reports/dynamo.test_dynamic_shapes_1.2_6173a3b68ef2ec1a_.log (deflated 93%) 2025-08-14T23:00:21.0045007Z adding: test/test-reports/dynamo.test_subclasses_1.1_25bbe11b539e35b7_.log (deflated 50%) 2025-08-14T23:00:21.0429521Z adding: test/test-reports/test_transformers_1.1_3eb8e83083563ca4_.log (deflated 98%) 2025-08-14T23:00:21.0430904Z adding: test/test-reports/export.test_sparse_1.1_062c01233bbe2a29_.log (deflated 50%) 2025-08-14T23:00:21.0432193Z adding: test/test-reports/profiler.test_profiler_1.1_2fece71adee404bc_.log (deflated 78%) 2025-08-14T23:00:21.0433563Z adding: test/test-reports/functorch.test_eager_transforms_1.1_1c868ebc237dc36c_.log (deflated 51%) 2025-08-14T23:00:21.0440423Z adding: test/test-reports/dynamo.test_repros_1.1_4d3d01defeadd90c_.log (deflated 89%) 2025-08-14T23:00:21.0441089Z adding: test/test-reports/export.test_draft_export_1.1_209b8ecd104b73f2_.log (deflated 51%) 2025-08-14T23:00:21.0441790Z adding: test/test-reports/inductor.test_dependencies_1.1_66551d8c8b89fd5c_.log (deflated 67%) 2025-08-14T23:00:21.0442452Z adding: test/test-reports/dynamo.test_comptime_1.1_dbfcdeda804ffc40_.log (deflated 50%) 2025-08-14T23:00:21.0473141Z adding: test/test-reports/dynamo.test_dynamic_shapes_2.2_40957ef1725ed7e7_.log (deflated 93%) 2025-08-14T23:00:21.0473857Z adding: test/test-reports/dynamo.test_python_autograd_1.1_b0d85c8deeb69608_.log (deflated 51%) 2025-08-14T23:00:21.0550606Z adding: test/test-reports/test_foreach_1.1_0ddcf304dd3b0413_.log (deflated 97%) 2025-08-14T23:00:21.0551227Z adding: test/test-reports/test_quantization_1.4_9d99d25d6b9bebf5_.log (deflated 58%) 2025-08-14T23:00:21.0552049Z adding: test/test-reports/test_typing_1.1_266fde289d6a6dd0_.log (deflated 77%) 2025-08-14T23:00:21.0552695Z adding: test/test-reports/test_dynamic_shapes_1.1_f9cb3f8d1fac3562_.log (deflated 50%) 2025-08-14T23:00:21.0553497Z adding: test/test-reports/lazy.test_functionalization_1.1_756de0a3be111ef5_.log (deflated 51%) 2025-08-14T23:00:21.0554340Z adding: test/test-reports/higher_order_ops.test_invoke_subgraph_1.1_b76a8da441fbc091_.log (deflated 51%) 2025-08-14T23:00:21.0555184Z adding: test/test-reports/test_flop_counter_1.1_25bd73df2d211d23_.log (deflated 79%) 2025-08-14T23:00:21.0555950Z adding: test/test-reports/test_package_1.1_a76776829667ac1d_.log (deflated 56%) 2025-08-14T23:00:21.0559923Z adding: test/test-reports/dynamo.test_modules_1.1_4817c0227736a0db_.log (deflated 90%) 2025-08-14T23:00:21.0560603Z adding: test/test-reports/functorch.test_rearrange_1.1_7502235f9b58e1be_.log (deflated 51%) 2025-08-14T23:00:21.0562711Z adding: test/test-reports/inductor.test_perf_1.1_3ceddc46beaa51ca_.log (deflated 88%) 2025-08-14T23:00:21.0563383Z adding: test/test-reports/functorch.test_parsing_1.1_c39affa5939ca9f1_.log (deflated 50%) 2025-08-14T23:00:21.1296206Z adding: test/test-reports/test_meta_1.1_09f4c36b551bfcb4_.log (deflated 97%) 2025-08-14T23:00:21.1296803Z adding: test/test-reports/test_hop_infra_1.1_d1bb2d795a77e2ba_.log (deflated 49%) 2025-08-14T23:00:21.1307918Z adding: test/test-reports/inductor.test_foreach_1.1_0804f39aba0fa162_.log (deflated 95%) 2025-08-14T23:00:21.1308577Z adding: test/test-reports/test_comparison_utils_1.1_1b8b221e45cbcafa_.log (deflated 50%) 2025-08-14T23:00:21.1309229Z adding: test/test-reports/inductor.test_halide_1.1_95b2b4243c683e1d_.log (deflated 35%) 2025-08-14T23:00:21.1309897Z adding: test/test-reports/functorch.test_control_flow_1.1_f1c17bc87f00f757_.log (deflated 51%) 2025-08-14T23:00:21.1310861Z adding: test/test-reports/dynamo.test_guard_manager_1.1_20c705b7bd535116_.log (deflated 85%) 2025-08-14T23:00:21.1311525Z adding: test/test-reports/functorch.test_ac_logging_1.1_28fd3d4086bed1e0_.log (deflated 50%) 2025-08-14T23:00:21.1312212Z adding: test/test-reports/test_openmp_1.1_4ed83f0c682194e2_.log (deflated 53%) 2025-08-14T23:00:21.1312839Z adding: test/test-reports/test_mkl_verbose_1.1_f614497f0410631b_.log (deflated 50%) 2025-08-14T23:00:21.1313542Z adding: test/test-reports/functorch.test_vmap_registrations_1.1_80a022d446ccf29d_.log (deflated 51%) 2025-08-14T23:00:21.1314271Z adding: test/test-reports/test_appending_byte_serializer_1.1_bad8dde96e09343c_.log (deflated 51%) 2025-08-14T23:00:21.1317667Z adding: test/test-reports/test_pytree_1.1_b9f85386d888d755_.log (deflated 89%) 2025-08-14T23:00:21.1318341Z adding: test/test-reports/test_autoload_1.1_292d270800684cbb_.log (deflated 49%) 2025-08-14T23:00:21.1319408Z adding: test/test-reports/test_fx_passes_1.1_c6b0378745071a6a_.log (deflated 89%) 2025-08-14T23:00:21.1320049Z adding: test/test-reports/test_proxy_tensor_1.1_b158154d5a1d679d_.log (deflated 50%) 2025-08-14T23:00:21.1320675Z adding: test/test-reports/xpu.test_gemm_1.1_2e73dd04997b20e6_.log (deflated 49%) 2025-08-14T23:00:21.1321342Z adding: test/test-reports/test_mkldnn_verbose_1.1_34ab693c2987bebd_.log (deflated 50%) 2025-08-14T23:00:21.1321982Z adding: test/test-reports/inductor.test_mps_basic_1.1_25f4b06debb1e5f4_.log (deflated 35%) 2025-08-14T23:00:21.1322659Z adding: test/test-reports/test_utils_config_module_1.1_bd5ae5652df018f0_.log (deflated 50%) 2025-08-14T23:00:21.1323321Z adding: test/test-reports/export.test_verifier_1.1_5ba982a4127f2374_.log (deflated 73%) 2025-08-14T23:00:21.1324094Z adding: test/test-reports/test_functionalization_1.1_fb23d506bf758cc2_.log (deflated 50%) 2025-08-14T23:00:21.1325017Z adding: test/test-reports/functorch.test_ac_1.1_a6afdf5eb339bcc1_.log (deflated 50%) 2025-08-14T23:00:21.1325730Z adding: test/test-reports/test_rename_privateuse1_to_existing_device_1.1_2d86974612dfb5d2_.log (deflated 52%) 2025-08-14T23:00:21.1328050Z adding: test/test-reports/test_namedtensor_1.1_36a38d250cecc506_.log (deflated 88%) 2025-08-14T23:00:21.1328677Z adding: test/test-reports/test_transformers_1.1_b0c8acb976b034f3_.log (deflated 50%) 2025-08-14T23:00:21.1366572Z adding: test/test-reports/test_nestedtensor_1.1_222557c6a035edc5_.log (deflated 96%) 2025-08-14T23:00:21.1367982Z adding: test/test-reports/test_ao_sparsity_1.1_9b1fb0198dc0af7f_.log (deflated 50%) 2025-08-14T23:00:21.1380771Z adding: test/test-reports/test_decomp_6.16_9cd5d861c9a0acea_.log (deflated 93%) 2025-08-14T23:00:21.1381366Z adding: test/test-reports/test_license_1.1_d0d720b3950d939a_.log (deflated 49%) 2025-08-14T23:00:21.1384353Z adding: test/test-reports/export.test_serialize_1.1_fab45c99fd420d5a_.log (deflated 89%) 2025-08-14T23:00:21.1385028Z adding: test/test-reports/torch_np.test_binary_ufuncs_1.1_320e65142ee686e8_.log (deflated 51%) 2025-08-14T23:00:21.1385856Z adding: test/test-reports/dynamo.test_recompile_ux_1.1_4520745ddbe41221_.log (deflated 73%) 2025-08-14T23:00:21.1386544Z adding: test/test-reports/torch_np.test_unary_ufuncs_1.1_76b5bdb4001fef58_.log (deflated 51%) 2025-08-14T23:00:21.1401210Z adding: test/test-reports/test_decomp_14.16_9a19e0d9365fd668_.log (deflated 93%) 2025-08-14T23:00:21.1401789Z adding: test/test-reports/test_foreach_1.1_badc3767e4b1644b_.log (deflated 49%) 2025-08-14T23:00:21.1402608Z adding: test/test-reports/dynamo.test_minifier_1.1_14ea3adfa1f435f4_.log (deflated 81%) 2025-08-14T23:00:21.1403261Z adding: test/test-reports/test_expanded_weights_1.1_2d9d6939094d86e9_.log (deflated 51%) 2025-08-14T23:00:21.1403960Z adding: test/test-reports/nn.test_parametrization_1.1_84bd565bf937ddee_.log (deflated 50%) 2025-08-14T23:00:21.1404747Z adding: test/test-reports/typing.test_python_operators_1.1_68470c26d8430657_.log (deflated 51%) 2025-08-14T23:00:21.1406112Z adding: test/test-reports/dynamo.test_hooks_1.1_8d67aa3ed2424294_.log (deflated 84%) 2025-08-14T23:00:21.1406860Z adding: test/test-reports/test_fx_experimental_1.1_a8cce33eaa1086ce_.log (deflated 50%) 2025-08-14T23:00:21.1414351Z adding: test/test-reports/test_jiterator_1.1_aa71ca629a29c84e_.log (deflated 96%) 2025-08-14T23:00:21.1414954Z adding: test/test-reports/test_file_check_1.1_c90f937f7159c024_.log (deflated 50%) 2025-08-14T23:00:21.1436283Z adding: test/test-reports/test_optim_1.1_8cc0d005484d4aeb_.log (deflated 95%) 2025-08-14T23:00:21.1436914Z adding: test/test-reports/torch_np.test_dtype_1.1_abb4dda8192a1528_.log (deflated 50%) 2025-08-14T23:00:21.1437527Z adding: test/test-reports/nn.test_pruning_1.1_3201999c0c2b5294_.log (deflated 50%) 2025-08-14T23:00:21.1438339Z adding: test/test-reports/higher_order_ops.test_with_effects_1.1_352a8a0d1c12b1d4_.log (deflated 51%) 2025-08-14T23:00:21.1440789Z adding: test/test-reports/dynamo.test_generator_1.1_625062984680acce_.log (deflated 89%) 2025-08-14T23:00:21.1441430Z adding: test/test-reports/test_native_functions_1.1_e97c2556dc14a754_.log (deflated 50%) 2025-08-14T23:00:21.1442069Z adding: test/test-reports/dynamo.test_reorder_logs_1.1_05c8f33663369f30_.log (deflated 80%) 2025-08-14T23:00:21.1442726Z adding: test/test-reports/functorch.test_minifier_1.1_e5c6449634bc41ef_.log (deflated 50%) 2025-08-14T23:00:21.1443367Z adding: test/test-reports/functorch.test_ac_1.1_84e4bd3b5e506236_.log (deflated 71%) 2025-08-14T23:00:21.1444043Z adding: test/test-reports/test_flop_counter_1.1_fa3f55f5fd24ddc9_.log (deflated 50%) 2025-08-14T23:00:21.1444737Z adding: test/test-reports/functorch.test_dims_1.1_f4585f7fb9c7e885_.log (deflated 50%) 2025-08-14T23:00:21.1445760Z adding: test/test-reports/torch_np.test_nep50_examples_1.1_2387ae9205898e1c_.log (deflated 51%) 2025-08-14T23:00:21.1446609Z adding: test/test-reports/test_itt_1.1_f246171f6883ebb7_.log (deflated 49%) 2025-08-14T23:00:21.1447479Z adding: test/test-reports/higher_order_ops.test_invoke_quant_1.1_e1ac74ec1b803681_.log (deflated 51%) 2025-08-14T23:00:21.1451302Z adding: test/test-reports/dynamo.test_subclasses_1.1_94f61300d483e594_.log (deflated 91%) 2025-08-14T23:00:21.1451953Z adding: test/test-reports/test_per_overload_api_1.1_41d416732ec83204_.log (deflated 50%) 2025-08-14T23:00:21.1456697Z adding: test/test-reports/export.test_sparse_1.1_4c590277200c60e8_.log (deflated 95%) 2025-08-14T23:00:21.1457398Z adding: test/test-reports/test_tensorexpr_pybind_1.1_b4c14d6fcc87ecc7_.log (deflated 50%) 2025-08-14T23:00:21.1458112Z adding: test/test-reports/test_pytree_1.1_ec340c49cd9d473f_.log (deflated 50%) 2025-08-14T23:00:21.1472308Z adding: test/test-reports/test_decomp_10.16_8e4ad3c724df4afa_.log (deflated 93%) 2025-08-14T23:00:21.1472904Z adding: test/test-reports/test_fx_passes_1.1_1c67b10632292282_.log (deflated 59%) 2025-08-14T23:00:21.1487703Z adding: test/test-reports/test_decomp_1.16_5087982246a9bfca_.log (deflated 93%) 2025-08-14T23:00:21.1488310Z adding: test/test-reports/test_module_tracker_1.1_b7fd2034efdc73f1_.log (deflated 50%) 2025-08-14T23:00:21.1488909Z adding: test/test-reports/test_typing_1.1_6e540a0c3fe3fbc3_.log (deflated 49%) 2025-08-14T23:00:21.1489474Z adding: test/test-reports/test_openmp_1.1_72ce3994a9a72144_.log (deflated 49%) 2025-08-14T23:00:21.1490051Z adding: test/test-reports/test_matmul_cuda_1.1_f60c34556e938061_.log (deflated 49%) 2025-08-14T23:00:21.1490753Z adding: test/test-reports/torch_np.numpy_tests.core.test_scalarinherit_1.1_ec9ce57b68ffd52a_.log (deflated 52%) 2025-08-14T23:00:21.1491704Z adding: test/test-reports/export.test_draft_export_1.1_aa5672157c894c89_.log (deflated 81%) 2025-08-14T23:00:21.1492425Z adding: test/test-reports/functorch.test_ac_knapsack_1.1_e99acd62d792888f_.log (deflated 51%) 2025-08-14T23:00:21.1493268Z adding: test/test-reports/dynamo.test_comptime_1.1_2382325d9cd30383_.log (deflated 74%) 2025-08-14T23:00:21.1493923Z adding: test/test-reports/backends.xeon.test_launch_1.1_5b8208ef42906249_.log (deflated 50%) 2025-08-14T23:00:21.1507912Z adding: test/test-reports/test_decomp_15.16_30431feb9eab4ba1_.log (deflated 93%) 2025-08-14T23:00:21.1508602Z adding: test/test-reports/test_nestedtensor_1.1_03c1ce670700ec21_.log (deflated 50%) 2025-08-14T23:00:21.1551953Z adding: test/test-reports/test_matmul_cuda_1.1_6f15f34674744d0e_.log (deflated 97%) 2025-08-14T23:00:21.1553032Z adding: test/test-reports/test_namedtensor_1.1_0d923a3a66c31934_.log (deflated 50%) 2025-08-14T23:00:21.1554104Z adding: test/test-reports/test_pruning_op_1.1_cdbfd65a1ffeb368_.log (deflated 55%) 2025-08-14T23:00:21.1555148Z adding: test/test-reports/xpu.test_gemm_1.1_238f31183018d989_.log (deflated 49%) 2025-08-14T23:00:21.1556231Z adding: test/test-reports/test_utils_config_module_1.1_fd8e3bd90c7dceea_.log (deflated 82%) 2025-08-14T23:00:21.1557611Z adding: test/test-reports/torch_np.test_ufuncs_basic_1.1_1744d4fb3bce5f13_.log (deflated 51%) 2025-08-14T23:00:21.1558210Z adding: test/test-reports/test_meta_1.1_2425b18fc48f3157_.log (deflated 50%) 2025-08-14T23:00:21.1558746Z adding: test/test-reports/test_weak_1.1_e2954cfebf7b943d_.log (deflated 84%) 2025-08-14T23:00:21.1559307Z adding: test/test-reports/test_utils_filelock_1.1_687f19e4b3d82991_.log (deflated 50%) 2025-08-14T23:00:21.1562185Z adding: test/test-reports/test_package_1.1_99aedd6e8acbed92_.log (deflated 89%) 2025-08-14T23:00:21.1562777Z adding: test/test-reports/torch_np.test_random_1.1_da908466fd69853a_.log (deflated 50%) 2025-08-14T23:00:21.1572065Z adding: test/test-reports/test_dynamic_shapes_1.1_84d10da10575cb0c_.log (deflated 94%) 2025-08-14T23:00:21.1572687Z adding: test/test-reports/profiler.test_kineto_1.1_f804d73d6e13a2b5_.log (deflated 50%) 2025-08-14T23:00:21.1573365Z adding: test/test-reports/functorch.test_rearrange_1.1_68d46f8f3fe8a6c3_.log (deflated 73%) 2025-08-14T23:00:21.1574040Z adding: test/test-reports/test_compile_benchmark_util_1.1_76645234b3b0d907_.log (deflated 52%) 2025-08-14T23:00:21.1574744Z adding: test/test-reports/test_logging_1.1_753319c24f227905_.log (deflated 48%) 2025-08-14T23:00:21.1575413Z adding: test/test-reports/test_jiterator_1.1_46e7ac5486686c8d_.log (deflated 49%) 2025-08-14T23:00:21.1576013Z adding: test/test-reports/test_optim_1.1_19eaa96a8d4541b8_.log (deflated 49%) 2025-08-14T23:00:21.1576980Z adding: test/test-reports/test_out_dtype_op_1.1_6ce46e0c26e8f090_.log (deflated 75%) 2025-08-14T23:00:21.1577613Z adding: test/test-reports/test_decomp_1.16_b0b09544b76e1c63_.log (deflated 49%) 2025-08-14T23:00:21.1834392Z adding: test/test-reports/test_binary_ufuncs_1.1_446d3fa2c7f05fb6_.log (deflated 97%) 2025-08-14T23:00:21.1834983Z adding: test/test-reports/test_decomp_5.16_c11d4d109966cd6c_.log (deflated 49%) 2025-08-14T23:00:21.1835539Z adding: test/test-reports/test_itt_1.1_fd531bec2e9d2c9e_.log (deflated 48%) 2025-08-14T23:00:21.1836085Z adding: test/test-reports/test_decomp_6.16_b1f9c052ce035092_.log (deflated 49%) 2025-08-14T23:00:21.1836968Z adding: test/test-reports/test_complex_1.1_db60d7a66c37f082_.log (deflated 78%) 2025-08-14T23:00:21.1837573Z adding: test/test-reports/test_decomp_10.16_94caaf3c5dab727a_.log (deflated 49%) 2025-08-14T23:00:21.1853044Z adding: test/test-reports/test_decomp_5.16_5ce3b51767e4ff8d_.log (deflated 93%) 2025-08-14T23:00:21.1853625Z adding: test/test-reports/test_decomp_14.16_37f87943fae2d21b_.log (deflated 49%) 2025-08-14T23:00:21.1854278Z adding: test/test-reports/inductor.test_extension_backend_1.1_0c911183e2b98bf4_.log (deflated 56%) 2025-08-14T23:00:21.1856057Z adding: test/test-reports/cpp_extensions.python_agnostic_extension.test.test_python_agnostic_1.1_eb61b89e110b1c28_.log (deflated 69%) 2025-08-14T23:00:21.1856889Z adding: test/test-reports/test_ops_fwd_gradients_1.1_d177a1f08e7c3620_.log (deflated 50%) 2025-08-14T23:00:21.1857600Z adding: test/test-reports/torch_np.numpy_tests.core.test_einsum_1.1_65832b144ad4ec28_.log (deflated 52%) 2025-08-14T23:00:21.1882997Z adding: test/test-reports/test_ops_jit_1.1_7e33795140610277_.log (deflated 95%) 2025-08-14T23:00:21.1884023Z adding: test/test-reports/test_ops_jit_1.1_7a4c4689e35378e5_.log (deflated 49%) 2025-08-14T23:00:21.1885411Z adding: test/test-reports/functorch.test_parsing_1.1_00b87e2aad5f0892_.log (deflated 75%) 2025-08-14T23:00:21.1886992Z adding: test/test-reports/lazy.test_step_closures_1.1_707e7d67edacdfcf_.log (deflated 50%) 2025-08-14T23:00:21.1996212Z adding: test/test-reports/test_ops_gradients_1.1_956ae068c06984d0_.log (deflated 96%) 2025-08-14T23:00:21.1997188Z adding: test/test-reports/test_ops_gradients_1.1_0ad2ea512b3975e7_.log (deflated 50%) 2025-08-14T23:00:21.1998749Z adding: test/test-reports/test_hop_infra_1.1_3c5afc8550f98874_.log (deflated 57%) 2025-08-14T23:00:21.2000052Z adding: test/test-reports/test_fx_reinplace_pass_1.1_65f9ccd1ae07cca1_.log (deflated 50%) 2025-08-14T23:00:21.2002939Z adding: test/test-reports/torch_np.test_unary_ufuncs_1.1_cae29295b5303a0d_.log (deflated 87%) 2025-08-14T23:00:21.2004670Z adding: test/test-reports/nn.test_multihead_attention_1.1_999f13a34bd54af0_.log (deflated 51%) 2025-08-14T23:00:21.2006321Z adding: test/test-reports/functorch.test_aot_joint_with_descriptors_1.1_3e7d1f7d3736d426_.log (deflated 52%) 2025-08-14T23:00:21.2007506Z adding: test/test-reports/test_set_default_mobile_cpu_allocator_1.1_b468067168c6b3e6_.log (deflated 51%) 2025-08-14T23:00:21.2008365Z adding: test/test-reports/torch_np.numpy_tests.core.test_multiarray_1.2_2e05f2b6d13b8a81_.log (deflated 52%) 2025-08-14T23:00:21.2009221Z adding: test/test-reports/profiler.test_execution_trace_1.1_9a7e7c6ddf7adba9_.log (deflated 51%) 2025-08-14T23:00:21.2010108Z adding: test/test-reports/torch_np.numpy_tests.core.test_indexing_1.1_a00be2fe555d7ac0_.log (deflated 52%) 2025-08-14T23:00:21.2010924Z adding: test/test-reports/torch_np.numpy_tests.lib.test_type_check_1.1_6ae81ac222450123_.log (deflated 52%) 2025-08-14T23:00:21.2011790Z adding: test/test-reports/benchmark_utils.test_benchmark_utils_1.1_8c41a580c2bf2712_.log (deflated 52%) 2025-08-14T23:00:21.2030664Z adding: test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_2.2_c50a59be0fe94b5e_.log (deflated 95%) 2025-08-14T23:00:21.2056067Z adding: test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.2_6e921a24d544b630_.log (deflated 95%) 2025-08-14T23:00:21.2065959Z adding: test/test-reports/inductor.test_torchinductor_opinfo_5.11_5f480f72202a5622_.log (deflated 93%) 2025-08-14T23:00:21.2090821Z adding: test/test-reports/inductor.test_torchinductor_dynamic_shapes_1.2_8f848d6bbef141cc_.log (deflated 95%) 2025-08-14T23:00:21.2101327Z adding: test/test-reports/inductor.test_torchinductor_opinfo_2.11_2e7a133ca1cb6b1c_.log (deflated 94%) 2025-08-14T23:00:21.2110280Z adding: test/test-reports/inductor.test_torchinductor_opinfo_8.11_958e156557c89031_.log (deflated 93%) 2025-08-14T23:00:21.2111252Z adding: test/test-reports/inductor.test_kernel_benchmark_1.1_40eaa2f5eb381aa6_.log (deflated 80%) 2025-08-14T23:00:21.2116855Z adding: test/test-reports/inductor.test_cudagraph_trees_1.1_d7f34b4635e93142_.log (deflated 92%) 2025-08-14T23:00:21.2118322Z adding: test/test-reports/inductor.test_pattern_matcher_1.1_7f9e56c8e0a8bf85_.log (deflated 87%) 2025-08-14T23:00:21.2128088Z adding: test/test-reports/inductor.test_mkldnn_pattern_matcher_1.1_2c5b86ab1437d1d7_.log (deflated 96%) 2025-08-14T23:00:21.2128949Z adding: test/test-reports/inductor.test_triton_wrapper_1.1_c30d1c70a30ca772_.log (deflated 51%) 2025-08-14T23:00:21.2129998Z adding: test/test-reports/inductor.test_coordinate_descent_tuner_1.1_aeadd411e4ddfc23_.log (deflated 68%) 2025-08-14T23:00:21.2130950Z adding: test/test-reports/inductor.test_select_algorithm_1.1_c5ea25282a63e0a3_.log (deflated 82%) 2025-08-14T23:00:21.2131925Z adding: test/test-reports/inductor.test_triton_extension_backend_1.1_845d01773facf697_.log (deflated 51%) 2025-08-14T23:00:21.2133063Z adding: test/test-reports/inductor.test_autoheuristic_1.1_3f14a2c1f52c5c25_.log (deflated 50%) 2025-08-14T23:00:21.2133879Z adding: test/test-reports/dynamo.test_precompile_context_1.1_87ab36be47245142_.log (deflated 60%) 2025-08-14T23:00:21.2134705Z adding: test/test-reports/dynamo.test_cudagraphs_expandable_segments_1.1_c9e3e333f5234bed_.log (deflated 74%) 2025-08-14T23:00:21.2140336Z adding: test/test-reports/test_sparse_semi_structured_1.1_53126d24062a7223_.log (deflated 94%) 2025-08-14T23:00:21.2141358Z adding: test/test-reports/dynamo.test_structured_trace_1.1_9a2f3ac74641179f_.log (deflated 82%) 2025-08-14T23:00:21.2147847Z adding: test/test-reports/inductor.test_cudagraph_trees_expandable_segments_1.1_950b494803e8b65e_.log (deflated 93%) 2025-08-14T23:00:21.2148877Z adding: test/test-reports/test_functionalization_of_rng_ops_1.1_b47c46664637f645_.log (deflated 77%) 2025-08-14T23:00:21.2159702Z adding: test/test-reports/functorch.test_eager_transforms_1.1_9cee15fb557e4f13_.log (deflated 93%) 2025-08-14T23:00:21.2160565Z adding: test/test-reports/dynamo.test_python_autograd_1.1_722ac34e8dc2525e_.log (deflated 66%) 2025-08-14T23:00:21.2170092Z adding: test/test-reports/inductor.test_torchinductor_opinfo_6.11_ea5dbff01e7c09b9_.log (deflated 94%) 2025-08-14T23:00:21.2171996Z adding: test/test-reports/higher_order_ops.test_invoke_subgraph_1.1_61913ed1c9421b98_.log (deflated 90%) 2025-08-14T23:00:21.2172963Z adding: test/test-reports/test_comparison_utils_1.1_9043bfc9d5d44193_.log (deflated 69%) 2025-08-14T23:00:21.2210299Z adding: test/test-reports/functorch.test_control_flow_1.1_78a84830e07b09aa_.log (deflated 97%) 2025-08-14T23:00:21.2211398Z adding: test/test-reports/functorch.test_ac_logging_1.1_b90373128048c882_.log (deflated 61%) 2025-08-14T23:00:21.2212588Z adding: test/test-reports/test_appending_byte_serializer_1.1_43bc8f05319ad8e6_.log (deflated 61%) 2025-08-14T23:00:21.2213660Z adding: test/test-reports/test_mkldnn_verbose_1.1_5f2ed14f9a9678a4_.log (deflated 54%) 2025-08-14T23:00:21.2214819Z adding: test/test-reports/test_functionalization_1.1_1945728fa9bb5a9b_.log (deflated 91%) 2025-08-14T23:00:21.2216045Z adding: test/test-reports/test_rename_privateuse1_to_existing_device_1.1_21d33a157b8b10f8_.log (deflated 54%) 2025-08-14T23:00:21.2217211Z adding: test/test-reports/torch_np.test_binary_ufuncs_1.1_94af6d5d3e2e4855_.log (deflated 87%) 2025-08-14T23:00:21.2223256Z adding: test/test-reports/test_expanded_weights_1.1_4975a78c4a2c19ca_.log (deflated 94%) 2025-08-14T23:00:21.2231557Z adding: test/test-reports/typing.test_python_operators_1.1_89efbe02bd836992_.log (deflated 94%) 2025-08-14T23:00:21.2250038Z adding: test/test-reports/test_fx_experimental_1.1_aa802a4a4851545d_.log (deflated 95%) 2025-08-14T23:00:21.2251423Z adding: test/test-reports/torch_np.test_dtype_1.1_1bef8a76633557a4_.log (deflated 89%) 2025-08-14T23:00:21.2252723Z adding: test/test-reports/test_native_functions_1.1_5c9875f3a132f40e_.log (deflated 75%) 2025-08-14T23:00:21.2253888Z adding: test/test-reports/functorch.test_minifier_1.1_c710732635af4530_.log (deflated 64%) 2025-08-14T23:00:21.2255029Z adding: test/test-reports/higher_order_ops.test_with_effects_1.1_216deb300ab15c33_.log (deflated 77%) 2025-08-14T23:00:21.2269672Z adding: test/test-reports/inductor.test_compiled_autograd_1.2_564c2bde9d62815d_.log (deflated 92%) 2025-08-14T23:00:21.2298600Z adding: test/test-reports/torch_np.test_nep50_examples_1.1_572621bd9f55a166_.log (deflated 96%) 2025-08-14T23:00:21.2300061Z adding: test/test-reports/higher_order_ops.test_invoke_quant_1.1_c2763aa20ec90f73_.log (deflated 79%) 2025-08-14T23:00:21.2301313Z adding: test/test-reports/test_per_overload_api_1.1_5de49593a8d9d53b_.log (deflated 59%) 2025-08-14T23:00:21.2302481Z adding: test/test-reports/test_tensorexpr_pybind_1.1_b5ffeda4e3be9d9d_.log (deflated 77%) 2025-08-14T23:00:21.2303752Z adding: test/test-reports/test_module_tracker_1.1_b53cd071b9cf6b23_.log (deflated 58%) 2025-08-14T23:00:21.2305090Z adding: test/test-reports/torch_np.numpy_tests.core.test_scalarinherit_1.1_abff485e32153282_.log (deflated 61%) 2025-08-14T23:00:21.2306560Z adding: test/test-reports/functorch.test_ac_knapsack_1.1_f5f4022801113cf5_.log (deflated 79%) 2025-08-14T23:00:21.2307827Z adding: test/test-reports/backends.xeon.test_launch_1.1_c9f7df9da8f60944_.log (deflated 53%) 2025-08-14T23:00:21.2313544Z adding: test/test-reports/torch_np.test_ufuncs_basic_1.1_97cbd1fcd97e340b_.log (deflated 96%) 2025-08-14T23:00:21.2314741Z adding: test/test-reports/test_utils_filelock_1.1_f776f9c8ed7aa0d4_.log (deflated 54%) 2025-08-14T23:00:21.2316017Z adding: test/test-reports/torch_np.test_random_1.1_87bbaeba6ffe2b3a_.log (deflated 88%) 2025-08-14T23:00:21.2317168Z adding: test/test-reports/profiler.test_kineto_1.1_be7a79ae7a1c29a8_.log (deflated 51%) 2025-08-14T23:00:21.2318365Z adding: test/test-reports/test_compile_benchmark_util_1.1_f96d6f04f6075702_.log (deflated 53%) 2025-08-14T23:00:21.2320069Z adding: test/test-reports/functorch.test_logging_1.1_8cc33e680b0486f7_.log (deflated 51%) 2025-08-14T23:00:21.2321454Z adding: test/test-reports/torch_np.numpy_tests.core.test_einsum_1.1_d527d7857f2c30ab_.log (deflated 88%) 2025-08-14T23:00:21.2322944Z adding: test/test-reports/cpp_extensions.python_agnostic_extension.test.test_python_agnostic_1.1_58a80645f153ece8_.log (deflated 78%) 2025-08-14T23:00:21.2324614Z adding: test/test-reports/lazy.test_step_closures_1.1_0800ded4aaf37df2_.log (deflated 62%) 2025-08-14T23:00:21.2325897Z adding: test/test-reports/test_fx_reinplace_pass_1.1_69b70ee01497cd87_.log (deflated 77%) 2025-08-14T23:00:21.2327146Z adding: test/test-reports/nn.test_multihead_attention_1.1_5246fe959cdeb7ed_.log (deflated 84%) 2025-08-14T23:00:21.2328618Z adding: test/test-reports/functorch.test_aot_joint_with_descriptors_1.1_4957fdb80114d9cd_.log (deflated 77%) 2025-08-14T23:00:21.2329935Z adding: test/test-reports/lazy.test_functionalization_1.1_04badc742a9adb55_.log (deflated 56%) 2025-08-14T23:00:21.2331114Z adding: test/test-reports/test_set_default_mobile_cpu_allocator_1.1_25b665f9c9fca39a_.log (deflated 58%) 2025-08-14T23:00:21.2332601Z adding: test/test-reports/test_cuda_expandable_segments_1.1_225fc33edbc4dfb9_.log (deflated 90%) 2025-08-14T23:00:21.2333787Z adding: test/test-reports/test_stateless_1.1_631f24fc5337fddb_.log (deflated 89%) 2025-08-14T23:00:21.2334931Z adding: test/test-reports/test_numpy_interop_1.1_274392e8b1c71869_.log (deflated 87%) 2025-08-14T23:00:21.2336144Z adding: test/test-reports/profiler.test_record_function_1.1_641f4896a74536a5_.log (deflated 65%) 2025-08-14T23:00:21.2337357Z adding: test/test-reports/test_type_hints_1.1_109e02351f85f1b1_.log (deflated 50%) 2025-08-14T23:00:21.2338776Z adding: test/test-reports/nn.test_lazy_modules_1.1_9e32cfe3322535bd_.log (deflated 89%) 2025-08-14T23:00:21.2340027Z adding: test/test-reports/test_segment_reductions_1.1_e225824b354b0ebd_.log (deflated 93%) 2025-08-14T23:00:21.2341239Z adding: test/test-reports/test_import_stats_1.1_b5a9242fb437788c_.log (deflated 54%) 2025-08-14T23:00:21.2345580Z adding: test/test-reports/test_masked_1.1_710b562ab5663766_.log (deflated 94%) 2025-08-14T23:00:21.2416562Z adding: test/test-reports/test_ops_fwd_gradients_1.1_6bdac248fc67747e_.log (deflated 96%) 2025-08-14T23:00:21.2417362Z adding: test/test-reports/test_cuda_sanitizer_1.1_18743aaa2f74fa9a_.log (deflated 79%) 2025-08-14T23:00:21.2418192Z adding: test/test-reports/test_monitor_1.1_afb69103250d24ce_.log (deflated 62%) 2025-08-14T23:00:21.2419555Z adding: test/test-reports/profiler.test_cpp_thread_1.1_b9ccb5213e60c531_.log (deflated 80%) 2025-08-14T23:00:21.2422648Z adding: test/test-reports/test_subclass_1.1_09bfb567f3230108_.log (deflated 92%) 2025-08-14T23:00:21.2425104Z adding: test/test-reports/profiler.test_profiler_1.1_6efd2c1406cb25f8_.log (deflated 87%) 2025-08-14T23:00:21.2470149Z adding: test/test-reports/functorch.test_vmap_registrations_1.1_7566e4ce4b5bbc84_.log (deflated 96%) 2025-08-14T23:00:21.2472010Z adding: test/test-reports/nn.test_parametrization_1.1_ef1846533e323439_.log (deflated 90%) 2025-08-14T23:00:21.2473142Z adding: test/test-reports/nn.test_pruning_1.1_805917eaffb3b67f_.log (deflated 84%) 2025-08-14T23:00:21.2475483Z adding: test/test-reports/functorch.test_dims_1.1_e5c9ef7d186282ed_.log (deflated 88%) 2025-08-14T23:00:21.2490381Z adding: test/test-reports/functorch.test_aotdispatch_1.1_abfc07815adbe635_.log (deflated 94%) 2025-08-14T23:00:21.2492975Z adding: test/test-reports/test_datapipe_1.1_8a41c5819fb055b3_.log (deflated 88%) 2025-08-14T23:00:21.2615923Z adding: test/test-reports/test_schema_check_1.1_704d08f6f136a409_.log (deflated 97%) 2025-08-14T23:00:21.2616719Z adding: test/test-reports/test_bundled_inputs_1.1_89ad33cc94e1b2c2_.log (deflated 75%) 2025-08-14T23:00:21.2617672Z adding: test/test-reports/profiler.test_execution_trace_1.1_a62eea27030c7046_.log (deflated 80%) 2025-08-14T23:00:21.2618624Z adding: test/test-reports/lazy.test_generator_1.1_84760bf3b446f478_.log (deflated 55%) 2025-08-14T23:00:21.2621437Z adding: test/test-reports/torch_np.numpy_tests.core.test_indexing_1.1_1ede8d91f764b463_.log (deflated 88%) 2025-08-14T23:00:21.2639832Z adding: test/test-reports/test_maskedtensor_1.1_c46245c59626dad0_.log (deflated 96%) 2025-08-14T23:00:21.2641339Z adding: test/test-reports/torch_np.numpy_tests.lib.test_type_check_1.1_01c7801ce54ecfce_.log (deflated 88%) 2025-08-14T23:00:21.2642280Z adding: test/test-reports/lazy.test_ts_opinfo_1.1_447e9088e9ec6a5f_.log (deflated 62%) 2025-08-14T23:00:21.2643952Z adding: test/test-reports/test_tensorboard_1.1_42d64527d9396035_.log (deflated 86%) 2025-08-14T23:00:21.2645083Z adding: test/test-reports/test_mkldnn_fusion_1.1_f97910fb5be574f5_.log (deflated 67%) 2025-08-14T23:00:21.2645842Z adding: test/test-reports/optim.test_swa_utils_1.1_0f26b337401dba43_.log (deflated 36%) 2025-08-14T23:00:21.2646602Z adding: test/test-reports/benchmark_utils.test_benchmark_utils_1.1_60a5ea44053c058e_.log (deflated 71%) 2025-08-14T23:00:21.2701325Z adding: test/test-reports/test_sparse_csr_2.2_34fd65573e1f2227_.log (deflated 97%) 2025-08-14T23:00:21.2712975Z adding: test/test-reports/torch_np.numpy_tests.core.test_multiarray_1.2_47a564619079a337_.log (deflated 92%) 2025-08-14T23:00:21.2827485Z ##[group]Run # Remove any previous debugging artifacts if they exist 2025-08-14T23:00:21.2828070Z # Remove any previous debugging artifacts if they exist 2025-08-14T23:00:21.2828652Z rm -f debug-*.zip 2025-08-14T23:00:21.2828986Z if [ -d 'test/debug' ]; then 2025-08-14T23:00:21.2829466Z  zip -r "debug-${FILE_SUFFIX}.zip" test/debug 2025-08-14T23:00:21.2839698Z fi 2025-08-14T23:00:21.2849130Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:21.2849480Z env: 2025-08-14T23:00:21.2849685Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:21.2850006Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:21.2850534Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:21.2850992Z DEVICE_NAME: 2025-08-14T23:00:21.2851210Z DEVICE_TYPE: 2025-08-14T23:00:21.2851544Z FILE_SUFFIX: test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047 2025-08-14T23:00:21.2851924Z ##[endgroup] 2025-08-14T23:00:21.3068311Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-08-14T23:00:21.3068631Z with: 2025-08-14T23:00:21.3068857Z s3-bucket: gha-artifacts 2025-08-14T23:00:21.3069173Z s3-prefix: pytorch/pytorch/16976255024/1/artifact 2025-08-14T23:00:21.3069515Z retention-days: 14 2025-08-14T23:00:21.3069763Z if-no-files-found: warn 2025-08-14T23:00:21.3070048Z path: test-jsons-*.zip 2025-08-14T23:00:21.3070305Z name: artifact 2025-08-14T23:00:21.3070530Z region: us-east-1 2025-08-14T23:00:21.3070757Z env: 2025-08-14T23:00:21.3070975Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:21.3071299Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:21.3071837Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:21.3072316Z DEVICE_NAME: 2025-08-14T23:00:21.3072536Z DEVICE_TYPE: 2025-08-14T23:00:21.3072759Z ##[endgroup] 2025-08-14T23:00:21.9427973Z NOTE: s3-prefix specified, ignoring name parameter 2025-08-14T23:00:21.9428389Z With the provided path, there will be 1 file uploaded 2025-08-14T23:00:21.9428811Z Uploading to s3 prefix: pytorch/pytorch/16976255024/1/artifact 2025-08-14T23:00:21.9501581Z Starting upload of test-jsons-test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047.zip 2025-08-14T23:00:22.0718122Z Finished upload of test-jsons-test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047.zip 2025-08-14T23:00:22.1090544Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-08-14T23:00:22.1090867Z with: 2025-08-14T23:00:22.1091089Z s3-bucket: gha-artifacts 2025-08-14T23:00:22.1091403Z s3-prefix: pytorch/pytorch/16976255024/1/artifact 2025-08-14T23:00:22.1091739Z retention-days: 14 2025-08-14T23:00:22.1092191Z if-no-files-found: error 2025-08-14T23:00:22.1092559Z path: test-reports-*.zip 2025-08-14T23:00:22.1092813Z name: artifact 2025-08-14T23:00:22.1093040Z region: us-east-1 2025-08-14T23:00:22.1093263Z env: 2025-08-14T23:00:22.1093467Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:22.1093794Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:22.1094322Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:22.1094789Z DEVICE_NAME: 2025-08-14T23:00:22.1095004Z DEVICE_TYPE: 2025-08-14T23:00:22.1095221Z ##[endgroup] 2025-08-14T23:00:22.4465337Z NOTE: s3-prefix specified, ignoring name parameter 2025-08-14T23:00:22.4465821Z With the provided path, there will be 1 file uploaded 2025-08-14T23:00:22.4466247Z Uploading to s3 prefix: pytorch/pytorch/16976255024/1/artifact 2025-08-14T23:00:22.4537137Z Starting upload of test-reports-test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047.zip 2025-08-14T23:00:22.6403415Z Finished upload of test-reports-test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047.zip 2025-08-14T23:00:22.6769339Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-08-14T23:00:22.6769653Z with: 2025-08-14T23:00:22.6769872Z s3-bucket: gha-artifacts 2025-08-14T23:00:22.6770193Z s3-prefix: pytorch/pytorch/16976255024/1/artifact 2025-08-14T23:00:22.6770519Z retention-days: 14 2025-08-14T23:00:22.6770776Z if-no-files-found: ignore 2025-08-14T23:00:22.6771041Z path: logs-*.zip 2025-08-14T23:00:22.6771263Z name: artifact 2025-08-14T23:00:22.6771487Z region: us-east-1 2025-08-14T23:00:22.6771710Z env: 2025-08-14T23:00:22.6771912Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:22.6772373Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:22.6772899Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:22.6773369Z DEVICE_NAME: 2025-08-14T23:00:22.6773611Z DEVICE_TYPE: 2025-08-14T23:00:22.6773923Z ##[endgroup] 2025-08-14T23:00:23.0095485Z NOTE: s3-prefix specified, ignoring name parameter 2025-08-14T23:00:23.0096051Z With the provided path, there will be 1 file uploaded 2025-08-14T23:00:23.0096477Z Uploading to s3 prefix: pytorch/pytorch/16976255024/1/artifact 2025-08-14T23:00:23.0168560Z Starting upload of logs-test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047.zip 2025-08-14T23:00:23.2178991Z Finished upload of logs-test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047.zip 2025-08-14T23:00:23.2559718Z ##[group]Run seemethere/upload-artifact-s3@v5 2025-08-14T23:00:23.2560038Z with: 2025-08-14T23:00:23.2560259Z s3-bucket: gha-artifacts 2025-08-14T23:00:23.2560576Z s3-prefix: pytorch/pytorch/16976255024/1/artifact 2025-08-14T23:00:23.2560923Z retention-days: 14 2025-08-14T23:00:23.2561171Z if-no-files-found: ignore 2025-08-14T23:00:23.2561442Z path: debug-*.zip 2025-08-14T23:00:23.2561670Z name: artifact 2025-08-14T23:00:23.2561889Z region: us-east-1 2025-08-14T23:00:23.2562106Z env: 2025-08-14T23:00:23.2562327Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:23.2562655Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:23.2563178Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:23.2563643Z DEVICE_NAME: 2025-08-14T23:00:23.2563858Z DEVICE_TYPE: 2025-08-14T23:00:23.2564075Z ##[endgroup] 2025-08-14T23:00:23.5908449Z No files were found with the provided path: debug-*.zip. No artifacts will be uploaded. 2025-08-14T23:00:23.6262008Z ##[group]Run # shellcheck disable=SC2156 2025-08-14T23:00:23.6262379Z # shellcheck disable=SC2156 2025-08-14T23:00:23.6262917Z find . -iname "core.[1-9]*" -exec docker exec "${DOCKER_CONTAINER_ID}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-08-14T23:00:23.6272204Z shell: /usr/bin/bash -e {0} 2025-08-14T23:00:23.6272459Z env: 2025-08-14T23:00:23.6272672Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:23.6272999Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:23.6273517Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:23.6274104Z DEVICE_NAME: 2025-08-14T23:00:23.6274322Z DEVICE_TYPE: 2025-08-14T23:00:23.6274548Z ##[endgroup] 2025-08-14T23:00:24.0621236Z Prepare all required actions 2025-08-14T23:00:24.0621600Z Getting action download info 2025-08-14T23:00:24.1702064Z ##[group]Run ./.github/actions/upload-utilization-stats 2025-08-14T23:00:24.1702400Z with: 2025-08-14T23:00:24.1702590Z job_id: 48128162047 2025-08-14T23:00:24.1703038Z job_name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T23:00:24.1703535Z workflow_name: slow 2025-08-14T23:00:24.1703774Z workflow_run_id: 16976255024 2025-08-14T23:00:24.1704029Z workflow_attempt: 1 2025-08-14T23:00:24.1704249Z env: 2025-08-14T23:00:24.1704449Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:24.1704756Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:24.1705269Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:24.1705727Z DEVICE_NAME: 2025-08-14T23:00:24.1705932Z DEVICE_TYPE: 2025-08-14T23:00:24.1706141Z ##[endgroup] 2025-08-14T23:00:24.1776564Z ##[group]Run echo "workflow_id: 16976255024" 2025-08-14T23:00:24.1776907Z echo "workflow_id: 16976255024" 2025-08-14T23:00:24.1777210Z echo "workflow_attempt: 1" 2025-08-14T23:00:24.1777509Z echo "workflow_Name: slow" 2025-08-14T23:00:24.1777811Z echo "job_id: 48128162047" 2025-08-14T23:00:24.1778366Z echo "job_name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu)" 2025-08-14T23:00:24.1779025Z echo "artifact_prefix: " 2025-08-14T23:00:24.1779319Z python3 --version 2025-08-14T23:00:24.1788297Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:24.1788640Z env: 2025-08-14T23:00:24.1788850Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:24.1789169Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:24.1789680Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:24.1790144Z DEVICE_NAME: 2025-08-14T23:00:24.1790374Z DEVICE_TYPE: 2025-08-14T23:00:24.1790586Z ##[endgroup] 2025-08-14T23:00:24.1826475Z workflow_id: 16976255024 2025-08-14T23:00:24.1826735Z workflow_attempt: 1 2025-08-14T23:00:24.1827386Z workflow_Name: slow 2025-08-14T23:00:24.1827666Z job_id: 48128162047 2025-08-14T23:00:24.1828115Z job_name: linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu) 2025-08-14T23:00:24.1828619Z artifact_prefix: 2025-08-14T23:00:24.1843411Z Python 3.9.23 2025-08-14T23:00:24.1934830Z ##[group]Run nick-fields/retry@v3.0.0 2025-08-14T23:00:24.1935125Z with: 2025-08-14T23:00:24.1935335Z shell: bash 2025-08-14T23:00:24.1935550Z timeout_minutes: 5 2025-08-14T23:00:24.1935789Z max_attempts: 5 2025-08-14T23:00:24.1936023Z retry_wait_seconds: 30 2025-08-14T23:00:24.1936541Z command: set -eu python3 -m pip install python-dateutil==2.8.2 boto3==1.35.42 pandas==2.1.3 dataclasses_json==0.6.7 2025-08-14T23:00:24.1937085Z polling_interval_seconds: 1 2025-08-14T23:00:24.1937399Z warning_on_retry: true 2025-08-14T23:00:24.1937686Z continue_on_error: false 2025-08-14T23:00:24.1937933Z env: 2025-08-14T23:00:24.1938142Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:24.1938462Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:24.1938974Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:24.1939432Z DEVICE_NAME: 2025-08-14T23:00:24.1939658Z DEVICE_TYPE: 2025-08-14T23:00:24.1939869Z ##[endgroup] 2025-08-14T23:00:24.5717201Z Defaulting to user installation because normal site-packages is not writeable 2025-08-14T23:00:24.6735698Z Collecting python-dateutil==2.8.2 2025-08-14T23:00:24.6902730Z Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB) 2025-08-14T23:00:25.8622102Z Collecting boto3==1.35.42 2025-08-14T23:00:25.8657189Z Downloading boto3-1.35.42-py3-none-any.whl (139 kB) 2025-08-14T23:00:26.4995904Z Collecting pandas==2.1.3 2025-08-14T23:00:26.5035317Z Downloading pandas-2.1.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.3 MB) 2025-08-14T23:00:26.8151946Z Requirement already satisfied: dataclasses_json==0.6.7 in /home/ec2-user/.local/lib/python3.9/site-packages (0.6.7) 2025-08-14T23:00:26.8167595Z Requirement already satisfied: six>=1.5 in /usr/lib/python3.9/site-packages (from python-dateutil==2.8.2) (1.15.0) 2025-08-14T23:00:26.8213147Z Requirement already satisfied: s3transfer<0.11.0,>=0.10.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from boto3==1.35.42) (0.10.4) 2025-08-14T23:00:26.8218167Z Requirement already satisfied: botocore<1.36.0,>=1.35.42 in /home/ec2-user/.local/lib/python3.9/site-packages (from boto3==1.35.42) (1.35.99) 2025-08-14T23:00:26.8221804Z Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /usr/lib/python3.9/site-packages (from boto3==1.35.42) (0.10.0) 2025-08-14T23:00:27.8653763Z Collecting numpy<2,>=1.22.4 2025-08-14T23:00:27.8692883Z Downloading numpy-1.26.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB) 2025-08-14T23:00:28.2883280Z Requirement already satisfied: pytz>=2020.1 in /usr/lib/python3.9/site-packages (from pandas==2.1.3) (2022.7.1) 2025-08-14T23:00:28.3538446Z Collecting tzdata>=2022.1 2025-08-14T23:00:28.3574300Z Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB) 2025-08-14T23:00:28.3956759Z Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from dataclasses_json==0.6.7) (3.26.1) 2025-08-14T23:00:28.3959566Z Requirement already satisfied: typing-inspect<1,>=0.4.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from dataclasses_json==0.6.7) (0.9.0) 2025-08-14T23:00:28.4059026Z Requirement already satisfied: urllib3<1.27,>=1.25.4 in /usr/lib/python3.9/site-packages (from botocore<1.36.0,>=1.35.42->boto3==1.35.42) (1.25.10) 2025-08-14T23:00:28.4157997Z Requirement already satisfied: packaging>=17.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses_json==0.6.7) (25.0) 2025-08-14T23:00:28.4264402Z Requirement already satisfied: typing-extensions>=3.7.4 in /home/ec2-user/.local/lib/python3.9/site-packages (from typing-inspect<1,>=0.4.0->dataclasses_json==0.6.7) (4.14.1) 2025-08-14T23:00:28.4268208Z Requirement already satisfied: mypy-extensions>=0.3.0 in /home/ec2-user/.local/lib/python3.9/site-packages (from typing-inspect<1,>=0.4.0->dataclasses_json==0.6.7) (1.1.0) 2025-08-14T23:00:28.7607964Z Installing collected packages: python-dateutil, tzdata, numpy, pandas, boto3 2025-08-14T23:00:34.2902831Z Attempting uninstall: boto3 2025-08-14T23:00:34.2905226Z Found existing installation: boto3 1.35.33 2025-08-14T23:00:34.3035861Z Uninstalling boto3-1.35.33: 2025-08-14T23:00:34.3053528Z Successfully uninstalled boto3-1.35.33 2025-08-14T23:00:34.4238123Z Successfully installed boto3-1.35.42 numpy-1.26.4 pandas-2.1.3 python-dateutil-2.8.2 tzdata-2025.2 2025-08-14T23:00:35.2827262Z Command completed after 1 attempt(s). 2025-08-14T23:00:35.2975244Z ##[group]Run python3 -m tools.stats.upload_utilization_stats.upload_utilization_stats \ 2025-08-14T23:00:35.2975891Z python3 -m tools.stats.upload_utilization_stats.upload_utilization_stats \ 2025-08-14T23:00:35.2976359Z  --workflow-run-id "16976255024" \ 2025-08-14T23:00:35.2976684Z  --workflow-name "slow" \ 2025-08-14T23:00:35.2977005Z  --workflow-run-attempt "1" \ 2025-08-14T23:00:35.2977315Z  --job-id "48128162047" \ 2025-08-14T23:00:35.2977842Z  --job-name "linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu)" \ 2025-08-14T23:00:35.2978386Z  --local-path "" \ 2025-08-14T23:00:35.2978673Z  --artifact-prefix "" 2025-08-14T23:00:35.2987699Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:35.2988050Z env: 2025-08-14T23:00:35.2988267Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:35.2988582Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:35.2989323Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:35.2989791Z DEVICE_NAME: 2025-08-14T23:00:35.2990018Z DEVICE_TYPE: 2025-08-14T23:00:35.2990233Z ##[endgroup] 2025-08-14T23:00:38.5974947Z repo: pytorch/pytorch 2025-08-14T23:00:38.5975316Z Search for test log in s3 bucket: ossci-utilization 2025-08-14T23:00:38.5975814Z Downloading logs-test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047.zip 2025-08-14T23:00:38.5976481Z extracting usage_log.txt from zip file logs-test-slow-3-3-linux.g5.4xlarge.nvidia.gpu_48128162047.zip 2025-08-14T23:00:38.5977021Z Converted Log Model: UtilizationMetadata: 2025-08-14T23:00:38.5978291Z UtilizationMetadata(level='metadata', workflow_id='16976255024', job_id='48128162047', workflow_name='slow', job_name='linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu)', usage_collect_interval=1.0, data_model_version=1.5, start_at=1755206874, gpu_count=1, cpu_count=16, gpu_type='pynvml', error=None) 2025-08-14T23:00:38.5979613Z [Db Segments] detected pytest cmd: 12, generated segments: 12 2025-08-14T23:00:38.5979998Z [db model] Peek db timeseries 2025-08-14T23:00:38.5980252Z :{ 2025-08-14T23:00:38.5980455Z "created_at": 1755212437, 2025-08-14T23:00:38.5980715Z "type": "utilization", 2025-08-14T23:00:38.5981213Z "tags": [ 2025-08-14T23:00:38.5981415Z "record" 2025-08-14T23:00:38.5981613Z ], 2025-08-14T23:00:38.5981910Z "time_stamp": 1755206874, 2025-08-14T23:00:38.5982177Z "repo": "pytorch/pytorch", 2025-08-14T23:00:38.5982443Z "workflow_id": 16976255024, 2025-08-14T23:00:38.5982744Z "run_attempt": 1, 2025-08-14T23:00:38.5982982Z "job_id": 48128162047, 2025-08-14T23:00:38.5983225Z "workflow_name": "slow", 2025-08-14T23:00:38.5983699Z "job_name": "linux-jammy-cuda12.8-py3.10-gcc11-sm86 / test (slow, 3, 3, linux.g5.4xlarge.nvidia.gpu)", 2025-08-14T23:00:38.5984193Z "json_data": "{}" 2025-08-14T23:00:38.5984407Z } 2025-08-14T23:00:38.5984865Z Writing 1 documents to S3 ossci-utilization/util_metadata/v_1.5/pytorch/pytorch/16976255024/1/48128162047/metadata 2025-08-14T23:00:38.5985688Z Done! Finish writing document to S3 ossci-utilization/util_metadata/v_1.5/pytorch/pytorch/16976255024/1/48128162047/metadata 2025-08-14T23:00:38.5986537Z Writing 1107 documents to S3 ossci-utilization/util_timeseries/v_1.5/pytorch/pytorch/16976255024/1/48128162047/time_series 2025-08-14T23:00:38.5987409Z Done! Finish writing document to S3 ossci-utilization/util_timeseries/v_1.5/pytorch/pytorch/16976255024/1/48128162047/time_series 2025-08-14T23:00:38.7223594Z ##[group]Run pytorch/test-infra/.github/actions/teardown-linux@main 2025-08-14T23:00:38.7224016Z with: 2025-08-14T23:00:38.7224223Z env: 2025-08-14T23:00:38.7224441Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:38.7224770Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:38.7225307Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:38.7225792Z DEVICE_NAME: 2025-08-14T23:00:38.7226017Z DEVICE_TYPE: 2025-08-14T23:00:38.7226239Z ##[endgroup] 2025-08-14T23:00:38.7320632Z ##[group]Run set -eou pipefail 2025-08-14T23:00:38.7321120Z set -eou pipefail 2025-08-14T23:00:38.7321374Z  2025-08-14T23:00:38.7321732Z echo "Holding runner for 2 hours until all ssh sessions have logged out" 2025-08-14T23:00:38.7322229Z for _ in $(seq 1440); do 2025-08-14T23:00:38.7322557Z  # Break if no ssh session exists anymore 2025-08-14T23:00:38.7322919Z  if [ "$(who)" = "" ]; then 2025-08-14T23:00:38.7323202Z  break 2025-08-14T23:00:38.7323438Z  fi 2025-08-14T23:00:38.7323666Z  echo "." 2025-08-14T23:00:38.7323898Z  sleep 5 2025-08-14T23:00:38.7324131Z done 2025-08-14T23:00:38.7333153Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:38.7333502Z env: 2025-08-14T23:00:38.7333720Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:38.7334048Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:38.7334564Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:38.7335036Z DEVICE_NAME: 2025-08-14T23:00:38.7335263Z DEVICE_TYPE: 2025-08-14T23:00:38.7335490Z ##[endgroup] 2025-08-14T23:00:38.7366482Z Holding runner for 2 hours until all ssh sessions have logged out 2025-08-14T23:00:38.7812382Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-08-14T23:00:38.7812966Z # ignore expansion of "docker ps -q" since it could be empty 2025-08-14T23:00:38.7813369Z # shellcheck disable=SC2046 2025-08-14T23:00:38.7813690Z docker stop $(docker ps -q) || true 2025-08-14T23:00:38.7814015Z # Prune all of the docker images 2025-08-14T23:00:38.7814330Z docker system prune -af 2025-08-14T23:00:38.7823233Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:00:38.7823577Z env: 2025-08-14T23:00:38.7823790Z GIT_DEFAULT_BRANCH: main 2025-08-14T23:00:38.7824113Z GPU_FLAG: --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all 2025-08-14T23:00:38.7824622Z DOCKER_CONTAINER_ID: 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:38.7825083Z DEVICE_NAME: 2025-08-14T23:00:38.7825430Z DEVICE_TYPE: 2025-08-14T23:00:38.7825651Z ##[endgroup] 2025-08-14T23:00:50.1893682Z 04bb4a0f273f 2025-08-14T23:00:53.1073990Z Deleted Containers: 2025-08-14T23:00:53.1074517Z 04bb4a0f273f7f691d330baac8a978ebf0131a8c90321964f23501f7bd32799c 2025-08-14T23:00:53.1074832Z 2025-08-14T23:01:06.6695094Z Deleted Images: 2025-08-14T23:01:06.6696123Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc11-bfa89110622ba7202628e9faac705f183070defe 2025-08-14T23:01:06.6697524Z untagged: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image@sha256:2912f52318e06e9a551538b828fea15def013f278443bb1f452fdd9a824ccbdf 2025-08-14T23:01:06.6698551Z deleted: sha256:9164ab1496895ac66808d8e6566acf1991fc744d321a8ac71d8dbd8989ac57a8 2025-08-14T23:01:06.6699150Z deleted: sha256:357ff02abb12b4e6dc62a9807cdcf90210163c9eb5b79ebcb8087a136c7a9624 2025-08-14T23:01:06.6699745Z deleted: sha256:bc8072040a821c95db9caee5922e15c3b33ff8e730a1bd6b2bd369575bc099dd 2025-08-14T23:01:06.6700368Z deleted: sha256:1c7d31d07c9c16c3e517ff3de02f6caa8a616ada40246d14459273a8c4bb6fc9 2025-08-14T23:01:06.6700972Z deleted: sha256:3aa634e29657dc4a8ec63b6e8f82e7fb2dc3b8236795618c4d7515dd1deef041 2025-08-14T23:01:06.6701935Z deleted: sha256:6b5aaba3359fe49963785f8eea8d32477cae8dfeb55f1f70799050b652e428cb 2025-08-14T23:01:06.6702537Z deleted: sha256:52965f454f6db1fa162de2a592649f2dd7657f1af5a78f0368d6d8a81948caad 2025-08-14T23:01:06.6703351Z deleted: sha256:a72815db8a9f6bd0058a29899376aa189f8fac03fede406748f6132393a946fe 2025-08-14T23:01:06.6703942Z deleted: sha256:7df8d373d9c639786fbea841e16860d7ab5410a26d1be7d6d46d883ce232d885 2025-08-14T23:01:06.6704525Z deleted: sha256:b86b653ea43c8d86e265ae4a5a89602d06b3d116c9987a72c9edc8c78e054f44 2025-08-14T23:01:06.6705117Z deleted: sha256:b54f9311e58cf7c18441cb9a177ca192f532c770c8eb0cdffe4c3661696e5214 2025-08-14T23:01:06.6705721Z deleted: sha256:e1fafa3baff3004ce1075fcd1bd6b3304d2067a8c69ed9c01cc0d99c58193ead 2025-08-14T23:01:06.6706486Z deleted: sha256:251f9b3f14df6e55a4ddb5b46d94257349736d240f14e07a4e7d569ec8e0de69 2025-08-14T23:01:06.6707065Z deleted: sha256:8c0537b4216685d2c318b8a07427bb51f717ba183a02340d497d477e301a579e 2025-08-14T23:01:06.6707654Z deleted: sha256:08a1136630587b38a8f2b64b02eb9b2cfe793dc78175e4761114e8db6e6a5a1a 2025-08-14T23:01:06.6708228Z deleted: sha256:0d77c5964995450909521a8177c5d07cba56e9f027217493f59576f8acdf28b1 2025-08-14T23:01:06.6708806Z deleted: sha256:d131fc8298dd130d3c44f639aad083b5c3c833fb95b8e86d16c3a86a53fed7b8 2025-08-14T23:01:06.6709433Z deleted: sha256:3ef9a13c3441c05d89bf04c2622680518963accdff757c3133768a5f36d44a59 2025-08-14T23:01:06.6710020Z deleted: sha256:423732cafd95507f1c8b6960dc321e9121758bd19c07ecaff796764c48f812ac 2025-08-14T23:01:06.6710608Z deleted: sha256:1250580e582a8082c9a7fcb1029f6979bdf5e62b017e8eb90d03ab7c43ed2d14 2025-08-14T23:01:06.6711216Z deleted: sha256:f677bc9f421dc3dfdd6de7492eca1ceb7bddaf8206af432fabe34cfe251543db 2025-08-14T23:01:06.6711813Z deleted: sha256:941a968d33883c191c0486587c764bf20712eb2bb2b03b09cce532670ed186fc 2025-08-14T23:01:06.6712391Z deleted: sha256:043db13566a8807d244bc594d977ea9f7383f6586a32ba040f7033e13ef2e48e 2025-08-14T23:01:06.6713123Z deleted: sha256:b81bb33469df700566354e7cda423ae6ce8e45ccc4136d7eec4a34b88db77bb4 2025-08-14T23:01:06.6713711Z deleted: sha256:d609d8649c6555c1c52d63c71ec4e296f006091df8914232c34fc4f58696deb3 2025-08-14T23:01:06.6714297Z deleted: sha256:7e639d8f34fe9403251c693cf86b26e8af665520cab9f3ab35f6b180c0e3e0fc 2025-08-14T23:01:06.6714892Z deleted: sha256:6b4bd9b4561fcd2bf20695e910f9f65af5d7303f561e72882ffe032e1aff0773 2025-08-14T23:01:06.6715481Z deleted: sha256:a6559f005499028fc02e209c2f31d518c9cad5628cc6f70406a10e3f0eb5ed1c 2025-08-14T23:01:06.6716062Z deleted: sha256:7768299b47134bfc194f59a1d520bac10faa756471f99110acbefe84e2f7666c 2025-08-14T23:01:06.6716649Z deleted: sha256:326e363550576f3ec8bcbe5195202103f16ea83e93634f5fb68bb83e47dfe7a1 2025-08-14T23:01:06.6717367Z deleted: sha256:9de7bdc015963ab859edac95675efae5de5f1e60f43e5e95f34b473838fbee6d 2025-08-14T23:01:06.6718099Z deleted: sha256:90c113e07464702caf40af63d3b7e132bf51571daad662d61c00bc8349369278 2025-08-14T23:01:06.6718773Z deleted: sha256:6f30157a83bf9b2a0260ce033ecd246efd4cc657b0b8e53872ff7861f8d38987 2025-08-14T23:01:06.6719363Z deleted: sha256:81e7b88efb01da657e39885410ae5a132234420c35bc995b48f1c44d99fd71d0 2025-08-14T23:01:06.6719955Z deleted: sha256:cbce4e380f012b10eea8fa17d0e28b2bf253add50da73b37ee2dfac09a8a9def 2025-08-14T23:01:06.6720545Z deleted: sha256:0c78152adfb28a188e08149441e9d912177e60de96dfff2943eff6bcb5229a5f 2025-08-14T23:01:06.6721250Z deleted: sha256:6e68f891a529517a27ce45f91386b63bc09b052326bd8127de697ab758ffcef0 2025-08-14T23:01:06.6721825Z deleted: sha256:be5492b93577145112829a8a046abad72ff2b7f794f17bba79a7336bcb1ee03b 2025-08-14T23:01:06.6722414Z deleted: sha256:468dd46a480a1b6e90fef6c9db146d698a2a56b2740a3dfcc948dadeecb1fefd 2025-08-14T23:01:06.6722997Z deleted: sha256:f454c4d92c94085bdfc2276f9706fa9f300729fff5d4b8bf58898ff9c21bfac2 2025-08-14T23:01:06.6723588Z deleted: sha256:af7c8151ef78814e22d809784cb172bc379dd0646251f3cb0bb7ebc77c4d0887 2025-08-14T23:01:06.6724178Z deleted: sha256:1cb2380680827d1cb491ca554620ec52cf4effb30a7da319c4f372c849606b4e 2025-08-14T23:01:06.6725128Z deleted: sha256:1e2535eb1ee959cc6d7b38b6bc2a7c10193aad069d880e97d208747ecbda4ec8 2025-08-14T23:01:06.6725729Z deleted: sha256:92ee6bfef40fead7a3433ef26386918a3163fce94aac92fb1e4cb71f05320abe 2025-08-14T23:01:06.6726316Z deleted: sha256:1f02fe855c3033a1b412a10757d028155895da7e83edfe0a141442866b5fc0ad 2025-08-14T23:01:06.6726892Z deleted: sha256:25f6a62d948c2c0aea8bb2f2cf7111436b5a02c74b75d3db781853676e506189 2025-08-14T23:01:06.6727469Z deleted: sha256:c67df85434a1966f06d1e1236d83c80a869f453682c5da6b35a66e5f5b34e4d6 2025-08-14T23:01:06.6728051Z deleted: sha256:062e344172d4d85be1177ede59c985d949b8347b1ae1ad0b9a9bee593aa92ac4 2025-08-14T23:01:06.6728631Z deleted: sha256:51a8e276bc1a12aa6ae88e24a1c0eb2903df026872b406be6323082df98eca6c 2025-08-14T23:01:06.6729203Z deleted: sha256:3e117f49d0cf9c39a6760c2f2b38f633637721bd32159d05e2f49127555d82bf 2025-08-14T23:01:06.6729774Z deleted: sha256:061415cdda961b7b3689417fe9d39c86befe527d0896bd250b62340bb5b27528 2025-08-14T23:01:06.6730363Z deleted: sha256:8bfab6cd3e0acc182171b9d44991bb606b034614c21416c8245c4759aebf5ee3 2025-08-14T23:01:06.6730946Z deleted: sha256:65a0ee625807f0ebc93b67064c8e706c3767e360461330bbe8d4bf14fea9c92d 2025-08-14T23:01:06.6731567Z deleted: sha256:395493ae36126133bf26ed55feee3c5ae241c30769d33af5175a64adab19938d 2025-08-14T23:01:06.6732152Z deleted: sha256:1fef2b9dda18b3bbbd78304d4193f9d786ad8778219e26c99dd6b5937575545a 2025-08-14T23:01:06.6732919Z deleted: sha256:365eaf31db7a1af5413508b38dead08f6c4cc1fcfc1dbe41f24d3d711f7ea6c3 2025-08-14T23:01:06.6733529Z deleted: sha256:911f2ddd021c84e00d3fbca1982dccabbfdb2dd8cd9487225de26dcfcfc1f3d1 2025-08-14T23:01:06.6734123Z deleted: sha256:4d1a5ca4c7087288e537ec78d57ca68d8e41b9ae8a2fffb439d9d9ee293aa9aa 2025-08-14T23:01:06.6734712Z deleted: sha256:283f379c6f6c891704cb4d2f9206776685fabe9ef880a180bf0f9a498eeaa6cb 2025-08-14T23:01:06.6735291Z deleted: sha256:95f30000c9142304cef4c4a369348899bfebf3d34a981e732ec51fd058c4b159 2025-08-14T23:01:06.6735870Z deleted: sha256:02734d77ee0381fce1fa6c90578bee1f2f43d2259cfab478b55921e2609d57da 2025-08-14T23:01:06.6736465Z deleted: sha256:235bcd58fea87acb602862382f31bb33574d79d3835c14dc0ef8aaccfad5ebf8 2025-08-14T23:01:06.6737051Z deleted: sha256:6b8760c007840be7c34ae6b47fc6fd522af5854da23e651ad5087175b7d3bf82 2025-08-14T23:01:06.6737638Z deleted: sha256:abf6ae05ec64a631075b15e3f6456a25a332025c7a65ce25bdac3bce114a6a36 2025-08-14T23:01:06.6738215Z deleted: sha256:d75025672a419cd6b6da7d8431512d489d1259d38a8efd6e7ea5fb6147db5022 2025-08-14T23:01:06.6738792Z deleted: sha256:192ab86bff6f5028045b194973f195ac36d77bbff2d0210335343dad58ce2e61 2025-08-14T23:01:06.6739367Z deleted: sha256:91a836176279d148938410ae5a56e2fff555933572f9ed994b8df5c2ee28d592 2025-08-14T23:01:06.6739920Z deleted: sha256:050626533c94618be639c899b99017d77e939297f472ea66955f90ca4267b929 2025-08-14T23:01:06.6740546Z deleted: sha256:93657d5f522e0cfe9c7a1b9824f0747918c7ab9594fb5e97fa64be9eef33f4ee 2025-08-14T23:01:06.6741159Z deleted: sha256:6a2492541d7dce8235005197a6424c59e6821e02e94654e5901f52cea63fbf7e 2025-08-14T23:01:06.6741887Z deleted: sha256:f692695fcdb5eba8c7ffe4cef01a6db8f910ce84f40e364a2e4405b977893ea6 2025-08-14T23:01:06.6742471Z deleted: sha256:10d11bf04d6dc7d510105333387db057a428e4cb6820b2952da7812946c043ed 2025-08-14T23:01:06.6743050Z deleted: sha256:a6a885c0244be22a19aa807cab263bbc0b36c72bef20e55486b7a5fc2214cd8d 2025-08-14T23:01:06.6743833Z deleted: sha256:ed788d7d9ca7e6a59df78a71d06136665ea087b1650afe1d563cd2b2002b3ed5 2025-08-14T23:01:06.6744661Z deleted: sha256:eda34df392aeb4f60e8f7d6a91c383942579b69fe7e213100cbbaee3a3a946c3 2025-08-14T23:01:06.6745245Z deleted: sha256:40957079bb22006ef598be517762f28bc08c90719961025902ffd01fc738386f 2025-08-14T23:01:06.6745821Z deleted: sha256:90a2bf02e851326fc70d05470553ed33e578342d6e06bfa0cfaf331c4079b7e4 2025-08-14T23:01:06.6746314Z untagged: public.ecr.aws/docker/library/python:3.13 2025-08-14T23:01:06.6746962Z untagged: public.ecr.aws/docker/library/python@sha256:50cbf8e58ca53a806b99250b1ba2d16c19433e8c42e7eb4ac4ea924b095e280b 2025-08-14T23:01:06.6748167Z deleted: sha256:51d2e96efc8fe0838f414c149b869ca6900d94dbe656ff27fe5c1302cf37e862 2025-08-14T23:01:06.6748764Z deleted: sha256:c1bb2eb73491df8bcc15d05b6059964a2a824a5be52827693d7bc24968f5c25b 2025-08-14T23:01:06.6749355Z deleted: sha256:d5213d9bf3afbfb2aef04c838094292dc6b52541bf9e55d21db911d7ae151f60 2025-08-14T23:01:06.6749944Z deleted: sha256:0d06cf036e638a073ebfb88fbda537a30dd48c28c24216cccc7c22e46e50a6e3 2025-08-14T23:01:06.6750528Z deleted: sha256:4e65c0510f8c22592464db0bb373e99224184267eec9f996c5de50ed7d194fad 2025-08-14T23:01:06.6751110Z deleted: sha256:e0e35c74660cb2629bda541aedaf15401e3287a0947fb2e97f04e0351a3acc3b 2025-08-14T23:01:06.6751700Z deleted: sha256:fee0f7166a1ec6858714827effcfc0757f33d255ca88bde2143984fc3d80b0d2 2025-08-14T23:01:06.6752293Z deleted: sha256:cb9eb84282d037ad85b02cf671ef6ca766c43bc957f88b048d16dd6deb6e68b8 2025-08-14T23:01:06.6752652Z 2025-08-14T23:01:06.6752767Z Total reclaimed space: 37.07GB 2025-08-14T23:01:06.6862054Z Post job cleanup. 2025-08-14T23:01:06.6911560Z Post job cleanup. 2025-08-14T23:01:06.7969625Z [command]/usr/bin/git version 2025-08-14T23:01:06.8032815Z git version 2.47.1 2025-08-14T23:01:06.8075459Z Copying '/home/ec2-user/.gitconfig' to '/home/ec2-user/actions-runner/_work/_temp/ff2b322b-07bc-460d-9715-42d13375530d/.gitconfig' 2025-08-14T23:01:06.8087029Z Temporarily overriding HOME='/home/ec2-user/actions-runner/_work/_temp/ff2b322b-07bc-460d-9715-42d13375530d' before making global git config changes 2025-08-14T23:01:06.8088407Z Adding repository directory to the temporary git global config as a safe directory 2025-08-14T23:01:06.8093760Z [command]/usr/bin/git config --global --add safe.directory /home/ec2-user/actions-runner/_work/pytorch/pytorch 2025-08-14T23:01:06.8142168Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-08-14T23:01:06.8190795Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-08-14T23:01:06.8615729Z Entering 'android/libs/fbjni' 2025-08-14T23:01:06.8696870Z Entering 'third_party/FP16' 2025-08-14T23:01:06.8778204Z Entering 'third_party/FXdiv' 2025-08-14T23:01:06.8864938Z Entering 'third_party/NNPACK' 2025-08-14T23:01:06.8951888Z Entering 'third_party/NVTX' 2025-08-14T23:01:06.9033369Z Entering 'third_party/VulkanMemoryAllocator' 2025-08-14T23:01:06.9114698Z Entering 'third_party/XNNPACK' 2025-08-14T23:01:06.9212043Z Entering 'third_party/aiter' 2025-08-14T23:01:06.9294922Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-08-14T23:01:06.9386455Z Entering 'third_party/benchmark' 2025-08-14T23:01:06.9467858Z Entering 'third_party/composable_kernel' 2025-08-14T23:01:06.9558171Z Entering 'third_party/cpp-httplib' 2025-08-14T23:01:06.9638284Z Entering 'third_party/cpuinfo' 2025-08-14T23:01:06.9720720Z Entering 'third_party/cudnn_frontend' 2025-08-14T23:01:06.9801912Z Entering 'third_party/cutlass' 2025-08-14T23:01:06.9895946Z Entering 'third_party/fbgemm' 2025-08-14T23:01:06.9978673Z Entering 'third_party/fbgemm/external/asmjit' 2025-08-14T23:01:07.0056885Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-08-14T23:01:07.0140741Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-08-14T23:01:07.0217612Z Entering 'third_party/fbgemm/external/cutlass' 2025-08-14T23:01:07.0307255Z Entering 'third_party/fbgemm/external/googletest' 2025-08-14T23:01:07.0384542Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-08-14T23:01:07.0463166Z Entering 'third_party/fbgemm/external/json' 2025-08-14T23:01:07.0546941Z Entering 'third_party/flash-attention' 2025-08-14T23:01:07.0627775Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-08-14T23:01:07.0713714Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-08-14T23:01:07.0804409Z Entering 'third_party/flatbuffers' 2025-08-14T23:01:07.0889201Z Entering 'third_party/fmt' 2025-08-14T23:01:07.0971151Z Entering 'third_party/gemmlowp/gemmlowp' 2025-08-14T23:01:07.1052755Z Entering 'third_party/gloo' 2025-08-14T23:01:07.1132971Z Entering 'third_party/googletest' 2025-08-14T23:01:07.1214034Z Entering 'third_party/ideep' 2025-08-14T23:01:07.1292736Z Entering 'third_party/ideep/mkl-dnn' 2025-08-14T23:01:07.1378677Z Entering 'third_party/ittapi' 2025-08-14T23:01:07.1460695Z Entering 'third_party/kineto' 2025-08-14T23:01:07.1539872Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-08-14T23:01:07.1615470Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-08-14T23:01:07.1696908Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-08-14T23:01:07.1775342Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-08-14T23:01:07.1854895Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-08-14T23:01:07.1931685Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-08-14T23:01:07.2016627Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-08-14T23:01:07.2093971Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-08-14T23:01:07.2173930Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-08-14T23:01:07.2253304Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-08-14T23:01:07.2335256Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-08-14T23:01:07.2414037Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-08-14T23:01:07.2496267Z Entering 'third_party/kleidiai' 2025-08-14T23:01:07.2578209Z Entering 'third_party/mimalloc' 2025-08-14T23:01:07.2658933Z Entering 'third_party/nlohmann' 2025-08-14T23:01:07.2740859Z Entering 'third_party/onnx' 2025-08-14T23:01:07.2842783Z Entering 'third_party/onnx/third_party/pybind11' 2025-08-14T23:01:07.2936822Z Entering 'third_party/opentelemetry-cpp' 2025-08-14T23:01:07.3021447Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-08-14T23:01:07.3099906Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-08-14T23:01:07.3177609Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-08-14T23:01:07.3254578Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-08-14T23:01:07.3331252Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-08-14T23:01:07.3407490Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-08-14T23:01:07.3487383Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-08-14T23:01:07.3564102Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-08-14T23:01:07.3644708Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-08-14T23:01:07.3728384Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-08-14T23:01:07.3829754Z Entering 'third_party/pocketfft' 2025-08-14T23:01:07.3909979Z Entering 'third_party/protobuf' 2025-08-14T23:01:07.3994154Z Entering 'third_party/protobuf/third_party/benchmark' 2025-08-14T23:01:07.4074055Z Entering 'third_party/protobuf/third_party/googletest' 2025-08-14T23:01:07.4156713Z Entering 'third_party/psimd' 2025-08-14T23:01:07.4237366Z Entering 'third_party/pthreadpool' 2025-08-14T23:01:07.4319115Z Entering 'third_party/pybind11' 2025-08-14T23:01:07.4400916Z Entering 'third_party/python-peachpy' 2025-08-14T23:01:07.4480606Z Entering 'third_party/sleef' 2025-08-14T23:01:07.4563656Z Entering 'third_party/tensorpipe' 2025-08-14T23:01:07.4641804Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-08-14T23:01:07.4719300Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-08-14T23:01:07.4796375Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-08-14T23:01:07.4876607Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-08-14T23:01:07.4952325Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-08-14T23:01:07.5063465Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-08-14T23:01:07.5091522Z http.https://github.com/.extraheader 2025-08-14T23:01:07.5104750Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-08-14T23:01:07.5144238Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-08-14T23:01:07.5552921Z Entering 'android/libs/fbjni' 2025-08-14T23:01:07.5609385Z http.https://github.com/.extraheader 2025-08-14T23:01:07.5660215Z Entering 'third_party/FP16' 2025-08-14T23:01:07.5713427Z http.https://github.com/.extraheader 2025-08-14T23:01:07.5765085Z Entering 'third_party/FXdiv' 2025-08-14T23:01:07.5821862Z http.https://github.com/.extraheader 2025-08-14T23:01:07.5873537Z Entering 'third_party/NNPACK' 2025-08-14T23:01:07.5925896Z http.https://github.com/.extraheader 2025-08-14T23:01:07.5975898Z Entering 'third_party/NVTX' 2025-08-14T23:01:07.6028073Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6079432Z Entering 'third_party/VulkanMemoryAllocator' 2025-08-14T23:01:07.6132849Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6185188Z Entering 'third_party/XNNPACK' 2025-08-14T23:01:07.6244773Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6313102Z Entering 'third_party/aiter' 2025-08-14T23:01:07.6366678Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6416573Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-08-14T23:01:07.6467502Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6528876Z Entering 'third_party/benchmark' 2025-08-14T23:01:07.6580009Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6632635Z Entering 'third_party/composable_kernel' 2025-08-14T23:01:07.6684655Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6744620Z Entering 'third_party/cpp-httplib' 2025-08-14T23:01:07.6796141Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6846083Z Entering 'third_party/cpuinfo' 2025-08-14T23:01:07.6897273Z http.https://github.com/.extraheader 2025-08-14T23:01:07.6949613Z Entering 'third_party/cudnn_frontend' 2025-08-14T23:01:07.6999011Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7048755Z Entering 'third_party/cutlass' 2025-08-14T23:01:07.7102835Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7165738Z Entering 'third_party/fbgemm' 2025-08-14T23:01:07.7223176Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7276274Z Entering 'third_party/fbgemm/external/asmjit' 2025-08-14T23:01:07.7326424Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7376331Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-08-14T23:01:07.7426277Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7484356Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-08-14T23:01:07.7534214Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7583796Z Entering 'third_party/fbgemm/external/cutlass' 2025-08-14T23:01:07.7633704Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7692323Z Entering 'third_party/fbgemm/external/googletest' 2025-08-14T23:01:07.7744226Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7794492Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-08-14T23:01:07.7844828Z http.https://github.com/.extraheader 2025-08-14T23:01:07.7895353Z Entering 'third_party/fbgemm/external/json' 2025-08-14T23:01:07.7945122Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8001332Z Entering 'third_party/flash-attention' 2025-08-14T23:01:07.8053757Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8102699Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-08-14T23:01:07.8153126Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8210996Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-08-14T23:01:07.8262600Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8327422Z Entering 'third_party/flatbuffers' 2025-08-14T23:01:07.8379867Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8449328Z Entering 'third_party/fmt' 2025-08-14T23:01:07.8495466Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8550580Z Entering 'third_party/gemmlowp/gemmlowp' 2025-08-14T23:01:07.8603522Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8655399Z Entering 'third_party/gloo' 2025-08-14T23:01:07.8707277Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8758071Z Entering 'third_party/googletest' 2025-08-14T23:01:07.8810731Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8861622Z Entering 'third_party/ideep' 2025-08-14T23:01:07.8914653Z http.https://github.com/.extraheader 2025-08-14T23:01:07.8963657Z Entering 'third_party/ideep/mkl-dnn' 2025-08-14T23:01:07.9013807Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9073327Z Entering 'third_party/ittapi' 2025-08-14T23:01:07.9125498Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9176349Z Entering 'third_party/kineto' 2025-08-14T23:01:07.9228854Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9279900Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-08-14T23:01:07.9330604Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9380697Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-08-14T23:01:07.9431328Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9487688Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-08-14T23:01:07.9538654Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9589858Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-08-14T23:01:07.9642250Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9693233Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-08-14T23:01:07.9743804Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9792768Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-08-14T23:01:07.9848348Z http.https://github.com/.extraheader 2025-08-14T23:01:07.9903598Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-08-14T23:01:07.9954681Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0004935Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-08-14T23:01:08.0055688Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0107135Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-08-14T23:01:08.0158474Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0210383Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-08-14T23:01:08.0261364Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0317538Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-08-14T23:01:08.0369831Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0420363Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-08-14T23:01:08.0470935Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0524270Z Entering 'third_party/kleidiai' 2025-08-14T23:01:08.0577890Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0627623Z Entering 'third_party/mimalloc' 2025-08-14T23:01:08.0680970Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0732391Z Entering 'third_party/nlohmann' 2025-08-14T23:01:08.0784870Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0836873Z Entering 'third_party/onnx' 2025-08-14T23:01:08.0890767Z http.https://github.com/.extraheader 2025-08-14T23:01:08.0956761Z Entering 'third_party/onnx/third_party/pybind11' 2025-08-14T23:01:08.1008178Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1063342Z Entering 'third_party/opentelemetry-cpp' 2025-08-14T23:01:08.1115844Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1168547Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-08-14T23:01:08.1223080Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1273717Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-08-14T23:01:08.1324287Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1373760Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-08-14T23:01:08.1423607Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1473502Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-08-14T23:01:08.1523917Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1576142Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-08-14T23:01:08.1626287Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1674981Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-08-14T23:01:08.1724858Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1774358Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-08-14T23:01:08.1824988Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1873751Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-08-14T23:01:08.1923534Z http.https://github.com/.extraheader 2025-08-14T23:01:08.1977434Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-08-14T23:01:08.2026782Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2081194Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-08-14T23:01:08.2130987Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2204449Z Entering 'third_party/pocketfft' 2025-08-14T23:01:08.2256579Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2306540Z Entering 'third_party/protobuf' 2025-08-14T23:01:08.2362596Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2415079Z Entering 'third_party/protobuf/third_party/benchmark' 2025-08-14T23:01:08.2465760Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2515374Z Entering 'third_party/protobuf/third_party/googletest' 2025-08-14T23:01:08.2566778Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2620157Z Entering 'third_party/psimd' 2025-08-14T23:01:08.2672648Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2723120Z Entering 'third_party/pthreadpool' 2025-08-14T23:01:08.2774772Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2825031Z Entering 'third_party/pybind11' 2025-08-14T23:01:08.2876502Z http.https://github.com/.extraheader 2025-08-14T23:01:08.2927660Z Entering 'third_party/python-peachpy' 2025-08-14T23:01:08.2979832Z http.https://github.com/.extraheader 2025-08-14T23:01:08.3029851Z Entering 'third_party/sleef' 2025-08-14T23:01:08.3081892Z http.https://github.com/.extraheader 2025-08-14T23:01:08.3135677Z Entering 'third_party/tensorpipe' 2025-08-14T23:01:08.3191286Z http.https://github.com/.extraheader 2025-08-14T23:01:08.3240034Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-08-14T23:01:08.3290944Z http.https://github.com/.extraheader 2025-08-14T23:01:08.3341262Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-08-14T23:01:08.3392754Z http.https://github.com/.extraheader 2025-08-14T23:01:08.3444112Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-08-14T23:01:08.3494762Z http.https://github.com/.extraheader 2025-08-14T23:01:08.3544699Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-08-14T23:01:08.3596156Z http.https://github.com/.extraheader 2025-08-14T23:01:08.3645294Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-08-14T23:01:08.3697457Z http.https://github.com/.extraheader 2025-08-14T23:01:08.3871694Z A job completed hook has been configured by the self-hosted runner administrator 2025-08-14T23:01:08.3903779Z ##[group]Run '/home/ec2-user/runner-scripts/after_job.sh' 2025-08-14T23:01:08.3911910Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-08-14T23:01:08.3912270Z ##[endgroup] 2025-08-14T23:01:16.4178970Z Cleaning up orphan processes